Towards Adaptive, Efficient and Scalable Text-to-Video Retrieval System

The School of EECS is hosting the following HDR Progress Review 1 Confirmation Seminar:

Speaker: Zecheng Zhao
Chair: Assoc. Prof. Sen Wang

Abstract: The explosive growth of video platforms, with millions of new uploads daily, has made it increasingly difficult for users to find relevant content. Text-to-Video Retrieval (TVR), which locates videos based on natural language queries, has thus become a critical capability for these platforms. While current methods perform well on static benchmarks, real-world deployment exposes three critical gaps: video content evolves continuously, corpora grow too large to search tractably, and training data is often noisy. To address these gaps, we first enable retrieval models to learn from new videos incrementally without forgetting previously learned content. Second, we formalize a generative-recall-then-dense-reranking paradigm that improves both retrieval speed and storage efficiency. Third, we investigate whether synthetic videos can serve as effective training data and identify the key quality factors that determine their utility. The goal of this research is to build a unified TVR system that can adapt, scale, and learn efficiently in real-world scenarios.

Bio: Zecheng Zhao is a PhD student in the Data Science group at the School of Electrical Engineering and Computer Science, The University of Queensland (UQ), Australia. He received both his Bachelor's and Master's degrees in Computer Science from UQ. His research focuses on text-to-video retrieval, continual learning, and generative retrieval, under the supervision of Professor Helen Huang, Associate Professor Rocky Chen, and Dr. Yadan Luo.

About Data Science Seminar

This seminar series is hosted by EECS Data Science.

Venue

Room: 78-411
Zoom Link: https://uqz.zoom.us/j/4342750927