Abstract:
Evaluating model accuracy is an indispensable early topic in machine learning courses. To perform evaluation, we need a test set consisting of test samples and their ground-truth labels. Whilst standard datasets (e.g., ImageNet) satisfy this requirement, many real-world scenarios only provide us with unlabeled test data, rendering common model evaluation methods infeasible.
In this talk, I will introduce an important but under-explored problem: label-free model evaluation (AutoEval). Specifically, given a labeled training set and a model, we aim to estimate the model's accuracy on unlabeled test sets. Key to this problem is designing discriminative dataset representations: generally speaking, the more similar the test distribution is to the training one, the higher the model accuracy we can expect. Focusing on the AutoEval problem, I will present our recent studies on how to describe a dataset. While AutoEval is challenging due to the complexity of test environments, we show that our systems produce reasonable and promising estimates of test set difficulty and model accuracy.
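To make the distribution-similarity intuition concrete, below is a minimal, hypothetical sketch of one way AutoEval can be instantiated: summarise each dataset by the mean and covariance of its features, measure its Fréchet distance to the training set, and regress observed accuracies on that distance using auxiliary "meta" sets whose accuracies are known. The function names and the linear-regression choice are illustrative assumptions, not necessarily the speaker's exact method.

```python
import numpy as np
from scipy import linalg
from sklearn.linear_model import LinearRegression

def dataset_stats(features: np.ndarray):
    """Represent a dataset by the mean and covariance of its features."""
    return features.mean(axis=0), np.cov(features, rowvar=False)

def frechet_distance(mu1, sigma1, mu2, sigma2) -> float:
    """Fréchet distance between Gaussians fitted to two feature sets."""
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard numerical imaginary residue
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

def fit_accuracy_regressor(train_feats, meta_feats_list, meta_accuracies):
    """Learn a distance -> accuracy mapping from meta sets with known accuracy."""
    mu_t, sig_t = dataset_stats(train_feats)
    dists = [[frechet_distance(mu_t, sig_t, *dataset_stats(f))]
             for f in meta_feats_list]
    return LinearRegression().fit(dists, meta_accuracies)

def estimate_accuracy(regressor, train_feats, test_feats) -> float:
    """Predict accuracy on an unlabeled test set from its distance alone."""
    mu_t, sig_t = dataset_stats(train_feats)
    d = frechet_distance(mu_t, sig_t, *dataset_stats(test_feats))
    return float(regressor.predict([[d]])[0])
```

In this sketch, a test set that drifts further from the training distribution yields a larger Fréchet distance and hence a lower predicted accuracy, which is exactly the "more similar, higher accuracy" intuition stated above.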
Bio:
Dr Liang Zheng is a Senior Lecturer, CS Futures Fellow and DECRA Fellow in the School of Computing, Australian National University. He is best known for his contributions to object re-identification, and his recent research interest is dataset-centric computer vision, where leveraging, analysing and improving data, rather than algorithms, is the primary concern. He is a co-organiser of the AI City workshop series at CVPR and the first data-centric workshop at CVPR. He received his B.S. degree (2010) and Ph.D. degree (2015) from Tsinghua University, China.
Host:
Professor Helen Huang
This session will be conducted via Zoom: https://uqz.zoom.us/j/89362232168
About Data Science Seminar
This seminar series is hosted by EECS Data Science.