The Data Science Discipline of the School of EECS is hosting the following guest seminar:

Inference-Time Strategies for Ranking and RAG with LLMs

Speaker: Dr Honglei Zhuang (Google DeepMind)
Host: Prof Guido Zuccon

Abstract: While Large Language Models (LLMs) exhibit strong zero-shot capabilities, carefully designed inference-time strategies are crucial for unlocking their full potential. This talk delves into two tasks where this is particularly evident: text ranking and retrieval-augmented generation (RAG). In text ranking, we investigate inference-time prompting strategies that elicit relevance judgments and ranking preferences from LLMs, illustrating how to build effective zero-shot text rankers. In RAG, we show that combining inference-time strategies such as incorporating demonstrations and iterative prompting enables LLMs to better utilize long context windows, achieving stronger performance in the long-context regime than simply increasing the number of retrieved documents. We also develop a quantitative model that predicts the optimal strategy for a given context window budget.

Bio: Honglei Zhuang is a research scientist at Google DeepMind. His research interests include information retrieval, natural language processing, data mining, and machine learning. He is particularly interested in building next-generation technology to revolutionize how information is accessed and leveraged with and for LLMs.

About Data Science Seminar

This seminar series is hosted by EECS Data Science.

Venue

Zoom: https://uqz.zoom.us/j/86807211342