The School of EECS is hosting the following PhD Thesis Review Seminar:
Effective and Secure Federated Online Learning to Rank
Speaker: Shuyi Wang
Host: Prof. Guido Zuccon
Abstract: Online Learning to Rank (OLTR) optimises ranking models using implicit user feedback, such as clicks, to directly improve search engine results in production. Unlike traditional Learning to Rank (LTR) methods, which learn a ranking model from a static set of training data with relevance judgements, OLTR methods update the model continually as new data arrives. OLTR thus overcomes several drawbacks of the traditional approach: the high cost of editorial annotation by human judges, the potential misalignment between the documents real users prefer and those human annotators deem relevant, and the difficulty of coping with rapidly changing query intents. However, this process requires OLTR methods to collect searchable data, user queries and clicks; current methods are therefore not suited to situations in which users want to maintain their privacy, i.e., not share their data, queries and clicks.
Federated Online Learning to Rank (FOLTR), which implements OLTR under a Federated Learning (FL) scenario, has been proposed as a solution to the privacy issue in OLTR. Existing work has shown promise; however, FOLTR methods lag behind traditional OLTR, which has been studied in the centralised setting. In particular, challenges currently faced by FOLTR methods include low effectiveness, unclear robustness with respect to the distribution of data across clients, unclear susceptibility to attacks on the learning process and the ranking model, and the lack of applicable methods for unlearning the interactions and training data of certain clients from the ranking model.
In this thesis, we comprehensively investigate the aforementioned challenges and build effective, secure federated online learning to rank methods.
We begin by addressing the effectiveness of FOLTR, identifying the shortcomings of existing work and proposing an effective method, termed FPDGD. As previous work failed to address the bias in users' implicit feedback, we leverage Pairwise Differentiable Gradient Descent (PDGD) and adapt it to the Federated Averaging framework. Empirical evaluation shows our method significantly outperforms the previous method.
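The Federated Averaging step underlying this approach can be illustrated with a minimal sketch: each client locally updates a (here, linear) ranking model, and a server combines the updates by a weighted average. The client models, weights, and function name below are illustrative assumptions, not the thesis's FPDGD implementation.

```python
# Sketch of FedAvg-style aggregation over linear rankers.
# All concrete values here are hypothetical, for illustration only.
import numpy as np

def federated_average(client_models, client_weights):
    """Aggregate client ranker weight vectors by a weighted average."""
    coeffs = np.array(client_weights, dtype=float) / sum(client_weights)
    stacked = np.stack(client_models)   # shape: (n_clients, n_features)
    return coeffs @ stacked             # weighted average per feature

# Example: three clients, each holding a 4-feature linear ranking model,
# weighted e.g. by the number of local user interactions.
models = [np.array([0.2, 0.1, 0.0, 0.5]),
          np.array([0.4, 0.3, 0.2, 0.1]),
          np.array([0.0, 0.2, 0.4, 0.6])]
weights = [100, 50, 50]
global_model = federated_average(models, weights)
# global_model is the new shared ranker broadcast back to all clients
```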
We then evaluate the robustness of current FOLTR systems under non-independent and identically distributed (non-IID) data across participating clients. We first enumerate four data distribution settings that may give rise to non-IID problems, then study the impact of each setting on ranking performance, highlighting which data distributions pose a problem for FOLTR methods.
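One common way to simulate such data skew in federated learning experiments is to partition the data across clients with Dirichlet-distributed proportions; a small concentration parameter yields a highly skewed (non-IID) split, a large one a near-uniform (IID-like) split. The sketch below is a generic illustration of this idea, not necessarily one of the four settings studied in the thesis.

```python
# Sketch: non-IID partition of items (e.g. queries) across clients
# via Dirichlet sampling. Parameter values are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def dirichlet_partition(n_items, n_clients, alpha):
    """Assign items to clients with Dirichlet-distributed proportions.
    Small alpha -> skewed (non-IID); large alpha -> near-uniform."""
    proportions = rng.dirichlet([alpha] * n_clients)
    assignments = rng.choice(n_clients, size=n_items, p=proportions)
    return [np.flatnonzero(assignments == c) for c in range(n_clients)]

# 1000 queries spread over 4 clients with a strongly skewed split.
parts = dirichlet_partition(n_items=1000, n_clients=4, alpha=0.1)
```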
Next, we investigate both data and model poisoning attack strategies and showcase their impact on FOLTR search effectiveness. We also explore how effectively robust aggregation rules for federated learning counter these attacks. Together, these experiments contribute an understanding of attack and defence methods for FOLTR systems and identify the key factors influencing their effectiveness.
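A representative robust aggregation rule from the federated learning literature is the coordinate-wise median, which limits the influence of a minority of poisoned updates. The sketch below is a generic illustration of such a defence; the client updates shown are invented for the example and are not results from the thesis.

```python
# Sketch: coordinate-wise median as a robust aggregation rule.
# Client update values are hypothetical, for illustration only.
import numpy as np

def coordinate_wise_median(client_updates):
    """Aggregate by taking the median of each model coordinate,
    bounding the effect of a minority of malicious clients."""
    return np.median(np.stack(client_updates), axis=0)

# Two honest clients and one poisoned client sending an extreme update.
honest_a = np.array([0.1, 0.2, 0.3])
honest_b = np.array([0.2, 0.1, 0.4])
poisoned = np.array([9.0, -9.0, 9.0])

robust = coordinate_wise_median([honest_a, honest_b, poisoned])
# Plain averaging (FedAvg) would instead be dragged towards the attacker:
naive = np.mean(np.stack([honest_a, honest_b, poisoned]), axis=0)
```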
Lastly, we study how to remove the contribution made by a client to the FOLTR system, proposing an effective and efficient unlearning method and evaluating its effectiveness using an evaluation methodology based on poisoning attacks.
In summary, this thesis delivers a comprehensive study of Federated Online Learning to Rank, tackling a range of previously unexplored areas in its effectiveness, robustness, and security, and thereby expanding the landscape of FOLTR.
Biography: Ms Shuyi Wang is a PhD candidate from the School of EECS under the supervision of Prof. Guido Zuccon and A/P Bevan Koopman. She received her B.S. degree from Nanjing Normal University and M.S. degree from Southeast University in China. Her research interests are Online Learning to Rank and Federated Learning.
About Data Science Seminar
This seminar series is hosted by EECS Data Science.