The School of EECS is hosting the following PhD Progress Review 1 Seminar

AI-driven Approaches for Effective Systematic Review Literature Screening

Speaker: Xinyu Mao
Host: Dr Joel Mackenzie

Abstract: A systematic review (SR) is a type of literature review that appraises and synthesises primary studies as evidence to answer specific research questions. Despite its importance to clinical treatment and public policy making, conducting a systematic review is time-consuming and labour-intensive. A systematic review typically takes more than two years to complete, requires over 1,100 hours of human effort, and costs more than $350K. This work aims to minimise the costs associated with the process of creating systematic reviews.

Most of the costs originate in the lengthy screening phase, in which a vast number of candidate studies are retrieved for human reviewers to assess their relevance to the research topic. The inclusion rate of these studies, however, is extremely low: on average, only around 2.9% of the candidate studies contribute to the conclusion. This PhD will focus on the screening phase and leverage recent AI techniques, including pre-trained language models. We pursue the following two main research directions:

1. Neural methods for systematic review literature ranking. To save the time and effort spent screening a large proportion of irrelevant studies, traditional machine learning methods have been widely adopted and dominate both academic research and commercial products. Recently, however, pre-trained language models (PLMs) have shown advantages over traditional models: they (1) require no feature engineering, (2) learn semantic meaning from context, and (3) require few or no labels for training. In our work, we will apply neural methods to improve screening effectiveness by ranking relevant studies earlier for the reviewer.

2. Prediction methods for systematic review screening. Aside from the low precision of the candidate studies with respect to the research question of the systematic review, the cost of screening originates from human involvement in assessing the studies. In this direction, we will consider developing prediction methods to reduce unnecessary effort in three aspects of screening: (1) for judging the relevance of a study, predicting relevance can ease the human effort of reviewing; (2) for the reviewing process, predicting the screening target together with stopping criteria can avoid the unnecessary effort of exhaustively screening all candidate studies; (3) for the query, performance prediction can reduce the cost of screening the low-precision candidate studies retrieved by a Boolean query.
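To make the idea of a stopping criterion in aspect (2) concrete, the sketch below simulates screening a ranked list and stopping once a run of consecutive irrelevant studies is seen. This is only an illustrative heuristic, not the method proposed in the talk; the function name screen_with_stopping and the patience parameter are hypothetical.

```python
from typing import Sequence, Tuple


def screen_with_stopping(ranked_labels: Sequence[int], patience: int) -> Tuple[int, int]:
    """Simulate screening a ranked list of studies with a simple stopping rule.

    ranked_labels: 1 for a relevant study, 0 for an irrelevant one, in the
        order the studies are shown to the reviewer (assumed known here only
        for simulation purposes).
    patience: hypothetical threshold; stop after this many consecutive
        irrelevant studies.

    Returns (number of studies screened, number of relevant studies found).
    """
    screened = 0
    found = 0
    consecutive_irrelevant = 0
    for label in ranked_labels:
        screened += 1
        if label == 1:
            found += 1
            consecutive_irrelevant = 0  # a relevant study resets the run
        else:
            consecutive_irrelevant += 1
            if consecutive_irrelevant >= patience:
                break  # stop early instead of exhausting the list
    return screened, found
```

For example, with the labels [1, 1, 0, 1, 0, 0, 0, 1, 0, 0] and a patience of 3, the simulation stops after 7 of the 10 studies, having found 3 of the 4 relevant ones. The example shows the trade-off such a rule makes between saved effort and missed studies.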

Within these proposed research directions, our work so far has focused on the first and explored the pros and cons of using neural methods. We have replicated a recent study that uses BERT within an active learning workflow for screening and found it to be computationally expensive and time-consuming. We then developed an efficient yet effective method that utilises dense retrieval and explicit feedback from reviewers during screening. Our preliminary work has shown that AI-driven methods are promising for easing the burden of the screening process.
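As a rough illustration of how dense retrieval can be combined with explicit reviewer feedback, the sketch below re-ranks the remaining studies after each judgement by moving the query vector towards relevant studies and away from irrelevant ones. It uses a classic Rocchio-style update as a stand-in and is not the method developed in this work; the function names, the alpha/beta/gamma weights, and the use of small hand-written vectors in place of real PLM embeddings are all assumptions made for the example.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rocchio_update(query: Vector, doc: Vector, relevant: bool,
                   alpha: float = 1.0, beta: float = 0.75,
                   gamma: float = 0.15) -> Vector:
    """Move the query towards a relevant study or away from an irrelevant one.

    The weights are illustrative defaults, not tuned values.
    """
    weight = beta if relevant else -gamma
    return [alpha * q + weight * d for q, d in zip(query, doc)]


def screen_with_feedback(query: Vector, docs: Dict[str, Vector],
                         judge: Callable[[str], bool]) -> List[str]:
    """Return the order in which studies are shown to the reviewer.

    At each step the most similar unscreened study is shown, the reviewer's
    judgement is collected via `judge`, and the query vector is updated.
    """
    remaining = dict(docs)
    order: List[str] = []
    while remaining:
        best = max(remaining, key=lambda doc_id: cosine(query, remaining[doc_id]))
        vector = remaining.pop(best)
        order.append(best)
        query = rocchio_update(query, vector, judge(best))
    return order
```

With a query vector [1.0, 0.0] and three toy studies a = [0.8, 0.6], b = [0.7, -0.7] and c = [0.6, 0.8], ranking by the initial query alone gives a, b, c. If the reviewer marks a and c as relevant, the feedback loop shows c before b, which is the kind of earlier ranking of relevant studies the abstract describes.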

Bio: Xinyu Mao is a PhD student in the School of EECS at the University of Queensland, under the supervision of Prof. Guido Zuccon and A/Prof. Bevan Koopman. Xinyu received his B.E. in Mechanical Engineering and B.A. in English Literature from Shanghai Ocean University and his master’s degree in Data Science from the University of Queensland. His research interests include Information Retrieval, NLP, and Active Learning.

About Data Science Seminar

This seminar series is hosted by EECS Data Science.

Venue

Room 49-601 (Advanced Engineering Building) or Online Link: https://uqz.zoom.us/j/9347338038