Integrating Text and Audio Signals for Efficient and Effective Podcast Search

The School of EECS is hosting the following seminar:

Integrating Text and Audio Signals for Efficient and Effective Podcast Search

Speaker: Watheq Mansour
Host: Joel Mackenzie

Abstract: Online podcasts are rapidly expanding as a popular medium of spoken audio, with more than 98 million episodes hosted on Apple Podcasts. Searching a corpus of this size poses multiple challenges, ranging from diversity of content and style to expensive audio processing and variable length. In this thesis, we aim to address these challenges and devise novel approaches that improve on state-of-the-art performance. We approach the task by modelling it as either text search or audio search. In text search, we apply text search techniques to the transcribed podcast content. In audio search, we process the audio directly using multi-modal models such as transformers. In the first year of this journey, we studied a recent and powerful document/passage expansion technique, chosen because it is a generic method that can be applied to both text and audio content. In addition, we studied the podcast search task, reviewed the literature, and conducted some preliminary experiments.

About Data Science Seminar

This seminar series is hosted by EECS Data Science.

Venue

Room: 78-346 - GP South level 3 room 346