Efficient and Elastic Large Models

The Data Science Discipline of the School of EECS is hosting the following guest seminar:

Efficient and Elastic Large Models

Speaker: Dr Prateek Jain, Google Research India
Hosts: Dr Mahsa Baktashmotlagh & Prof Guido Zuccon

Abstract: Generative LLMs are transforming multiple industries and have proven robust across a multitude of use cases and settings. One of the key impediments to their widespread deployment is the cost of serving them, along with the difficulty of deploying them across multiple devices and settings. In this talk, we will discuss the key challenges in improving the efficiency of LLM serving. We will then give an overview of multiple techniques to address the problem. In particular, we will discuss tandem transformers and HIRE, two techniques for speeding up decoding in LLMs.

Bio: Prateek Jain is a Principal Scientist at Google Research India, where he is also the director of Machine Learning and Optimization. He obtained his doctorate from UT Austin and his BTech from IIT Kanpur. He has conducted foundational research in the areas of efficient and elastic large models as well as in large-scale and non-convex optimization. Prateek regularly serves on the senior program committees of top ML conferences and is on the editorial boards of top ML journals, including JMLR and SIMODS. He has also won multiple best paper awards, including the 2020 Best Paper Award from the IEEE Signal Processing Society. Prateek also received the Young Alumnus Award from IIT Kanpur in 2021 and the ACM India Early Career Researcher Award in 2022.

About the Data Science Seminar

This seminar series is hosted by EECS Data Science.

Venue

Room 50-N201 - Hawken Engineering Building