About Me
I’m currently a Machine Learning Engineer at ByteDance, working mainly on the research and development of Vision-Language Models for e-commerce safety. I received my Master of Science in Engineering in June 2024 from the MCG Group, Department of Computer Science and Technology, Nanjing University, under the supervision of Assoc. Prof. Jie Tang. I also received my Bachelor of Science in Computer Science and Technology from Nanjing University in June 2021.
My research interests include Computer Vision, Multimodal Deep Learning, and Generative Deep Learning. Recently I have focused on Visual Object Tracking (VOT), Vision-Language Models, and Generative Models.
News
- [ 2025.09.19 ] MERIT is accepted by NeurIPS 2025! Code and Dataset are available now.
- [ 2025.06.12 ] We propose MERIT, the first multilingual dataset for interleaved multi-condition semantic retrieval, comprising 320,000 queries with 135,000 products in 5 languages and covering 7 distinct product categories. We also construct Coral, a novel fine-tuning framework that adapts pre-trained MLLMs for embedding extraction. arXiv and Project Page are available now.
- [ 2024.03.21 ] A Zhihu blog post has been published to explain the main ideas of the paper.
- [ 2023.10.18 ] Both the CVF and arXiv versions of ROMTrack are updated! This is a tracker that uses the newly proposed object modeling paradigm, significantly improving robustness. Code is available now.
- [ 2023.07.14 ] Good News! One paper, abbreviated as ROMTrack, is accepted by ICCV 2023.
Publications
- MERIT: Multilingual Semantic Retrieval with Interleaved Multi-Condition Query
Wei Chow, Yuan Gao, Linfeng Li, Xian Wang, Qi Xu, Hang Song, Lingdong Kong, Ran Zhou, Yi Zeng, Yidong Cai, Botian Jiang, Shilin Xu, Jiajun Zhang, Minghui Qiu, Xiangtai Li, Tianshu Yang, Siliang Tang, Juncheng Li.
The 39th Annual Conference on Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks, 2025.
- Robust Object Modeling for Visual Tracking
Yidong Cai, Jie Liu, Jie Tang, Gangshan Wu.
The 19th IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
Academic Services
- Journal Review:
- IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
- IEEE Transactions on Multimedia (TMM)
- IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
- ACM Transactions on Multimedia Computing, Communications and Applications (TOMM)
- Journal of Visual Communication and Image Representation (JVCIR)
- Conference Review:
- IEEE International Conference on Computer Vision (ICCV)
- Teaching Assistant:
- Introduction to Computer System (ICS)
- Multimedia Technology
Education
- 2021.9 - 2024.6: M.Sc., Nanjing University, Nanjing.
- Department of Computer Science and Technology.
- MCG Group, supervised by Assoc. Prof. Jie Tang, Prof. Liming Wang and Prof. Gangshan Wu.
- 2017.9 - 2021.6: B.Sc., Nanjing University, Nanjing.
- Department of Computer Science and Technology.
- 2020.9 - 2021.6: Research on Visual Object Tracking, supervised by Prof. Liming Wang.
- 2012.9 - 2017.6: Tianyi High School, Jiangsu.
- Both junior and senior high school.
Experience
- 2024.7 - Present: Machine Learning Engineer (MLE) - Multimodal.
- Governance and Experience, Global E-commerce, Data, ByteDance, Shanghai.
- Mainly focused on the research and development of Vision-Language Models for e-commerce safety.
- 2023.6 - 2023.9: Machine Learning Engineer (MLE) Intern - Computer Vision.
- Alimama, Taobao & Tmall Group, Alibaba Group, Hangzhou.
- Mainly focused on the research and development of Multimodal & AIGC algorithms.
Honors and Awards
- Outstanding Graduate Student of Nanjing University, 2024.
- Tencent Scholarship, 2024.
- Academic Scholarship of Nanjing University:
- 2021 (First Prize) & 2022 (Second Prize) & 2023 (Second Prize).
- People’s Scholarship of Nanjing University:
- 2018 (Second Prize) & 2019 (First Prize) & 2020 (Second Prize).
- Third Prize in Jiangsu Mathematical Modeling Competition, 2019.
- Silver Medal in 12th China Southeast Mathematical Olympiad, 2015.
Contact
- Email:
- Gmail: dawnyc1123@gmail.com
- Edu-mail: yidong_cai@smail.nju.edu.cn