Model Selection and Model Generalizability of Deep Neural Networks
The School of EECS is hosting the following PhD Progress Review 3 Seminar
Speaker: Ekaterina Khramtsova
Host: Dr. Mahsa Baktashmotlagh
Abstract: A key limitation of applying machine learning methods in real-world scenarios is the domain shift between the source (i.e., training) and target (i.e., test) datasets, which typically entails a significant performance drop. This is further complicated by the lack of annotated data in the target domain, making it impossible to quantitatively assess the model's performance or to analyze its behavior on out-of-distribution (OOD) datasets.
In this research, we explore various methods for assessing the generalizability of Deep Neural Networks on OOD samples in two scenarios: model selection and performance estimation.
We first consider the task of model selection for image classification: given a pool of models trained on a source dataset, the goal is to rank the models according to their performance on target datasets. We propose a novel method based solely on the topology of the model's weight distribution, without access to either training or test data. Our approach offers a means to examine the influence of diverse training methods on the network weight space, illuminating the behavior of neural networks and contributing to a more insightful model selection process. Next, we explore the same task of model selection for Information Retrieval (IR) models: we adapt the current state-of-the-art methods from computer vision and machine learning, outline their limitations, and summarize the main challenges in adapting them to the IR task.
The second challenge we address is performance prediction: given a model, the objective is to predict its performance on a target dataset. Our experiments reveal a correlation between the performance of the model and the degree to which the network changes when it is trained on a target dataset using a self-supervised loss. We further demonstrate how this correlation can be learned by utilizing a meta-set, constructed by augmenting the source data. This approach builds on the intuition that target data close to the source domain will produce more confident predictions, thus leading to small weight changes during fine-tuning.
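To make the intuition concrete, the "degree of network change" can be illustrated as a simple distance between weight snapshots taken before and after self-supervised fine-tuning. The sketch below is illustrative only, not the speaker's actual method; the function name and the L2 choice of distance are assumptions.

```python
import numpy as np

def weight_change(weights_before, weights_after):
    """L2 distance between flattened weight snapshots, used as a rough
    proxy for how much self-supervised fine-tuning perturbed the model.
    A smaller value suggests the target data lies closer to the source
    domain, which (per the abstract's intuition) correlates with better
    target performance."""
    before = np.concatenate([w.ravel() for w in weights_before])
    after = np.concatenate([w.ravel() for w in weights_after])
    return np.linalg.norm(after - before)

# Toy snapshots: two "layers" before and after fine-tuning.
before = [np.zeros((2, 2)), np.zeros(3)]
after = [np.full((2, 2), 0.1), np.zeros(3)]
print(weight_change(before, after))  # small change -> target close to source
```

In practice one would compare such a change score across many augmented versions of the source data (the meta-set), each with a known accuracy, and learn the mapping from change score to performance.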
Biography: Ekaterina Khramtsova received her B.Sc. degree in Software Engineering from Peter the Great St. Petersburg Polytechnic University, Russia, and her M.Sc. degree in Computer Science from the University of Luxembourg, Luxembourg. She is currently a PhD candidate in the School of Electrical Engineering and Computer Science, University of Queensland, Australia, under the supervision of Dr. Mahsa Baktashmotlagh and Prof. Guido Zuccon. Her research interests include model generalizability, topology, and persistent homology.
About Data Science Seminar
This seminar series is hosted by EECS Data Science.