Modelling Efficient and Robust Solutions for Microbiology Image Analysis Using Deep Learning

The School of EECS is hosting the following PhD Progress Review 3 seminar:

Modelling Efficient and Robust Solutions for Microbiology Image Analysis Using Deep Learning

Speaker: Sarah Abdulaziz Alhammad (BCompSc, MCompSc)
Host: Prof Brian Lovell

Abstract: The interpretation of microscopic images is a crucial aspect of diagnosis in clinical microbiology laboratories. Highly skilled microbiologists (pathologists) are needed to interpret images such as Gram stain smears. These samples contain vital diagnostic information, such as the presence and types of bacteria, specimen quality and cell counts. The manual interpretation of conventional glass microscopy slides remains time-consuming, labour-intensive and operator-dependent. In high-volume pathology laboratories, an artificial intelligence system can help alleviate the limitations of conventional pathology at scale. Such a system would help ensure accuracy, reduce the workload of pathologists, and enhance both objectivity and efficiency. This has motivated research into data-driven techniques for the automated interpretation of pathology images, in particular Gram stains.

With the rapid development of computer vision techniques, computer-aided diagnosis has become feasible. Since the emergence of deep learning, the analysis of pathology and medical images has shifted from traditional handcrafted features to deep learning algorithms. Convolutional Neural Networks (CNNs) are deep architectures that learn features from the dataset itself, which can improve performance and make classifiers and detectors more robust to variations.

Our review of the literature on pathology images shows that the automatic analysis of the Gram stain test using CNNs has not received the same attention as other pathology tests, such as those for breast cancer, lymphoma and colorectal cancer. Datasets for the very important Gram stain are exceedingly rare, and this data scarcity has likely hindered research on Gram stain automation. This thesis aims to apply deep learning techniques to analyse pathology images, including Gram stain data, and to discover techniques that improve the accuracy and efficiency of the analysis in several respects.

First, we proposed a CNN-based classifier for Gram-positive cocci bacteria subtypes in blood cultures. We studied the effect of downsampling, data augmentation, and image size on both classification accuracy and speed. Experiments were conducted on a novel dataset, provided by Sullivan Nicolaides Pathology (SNP), comprising three bacteria subtypes: Staphylococcus, Enterococcus and Streptococcus. The sub-images were obtained from blood culture whole slide images (WSIs) captured by the in-house SNP MicroLab using a ×63 objective without coverslips or oil immersion. Our results show that a CNN-based classifier distinguishes between these bacteria subtypes with high classification accuracy.
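
As a rough illustration of two of the preprocessing factors mentioned above, the following sketch shows 2× downsampling by block averaging and a horizontal-flip augmentation on a tiny greyscale tile. The function names and the toy tile are hypothetical; the abstract does not specify which downsampling or augmentation operations were actually used.

```python
def downsample_2x(image):
    """Average each non-overlapping 2x2 block of a greyscale image.

    `image` is a list of rows of pixel intensities (an assumed toy format).
    """
    height = len(image) // 2 * 2
    width = len(image[0]) // 2 * 2
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


def flip_horizontal(image):
    """Mirror the image left to right, a common augmentation."""
    return [row[::-1] for row in image]


# Hypothetical 4x4 tile standing in for a sub-image cropped from a slide.
tile = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [10, 10, 20, 20],
    [10, 10, 20, 20],
]

small = downsample_2x(tile)      # half the resolution: fewer pixels to classify
flipped = flip_horizontal(tile)  # an extra training view of the same tile
```

Halving the resolution cuts the pixel count by four, which is the kind of speed and accuracy trade-off the study examines.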

Second, existing CNN classification backbones assume that all testing classes are seen during training. However, it is sometimes impossible to collect instances of all bacteria subtypes during the model training phase. Standard CNN classifiers do not estimate their own uncertainty and assume full knowledge of the world. To reduce the risk of misdiagnosis in the bacteria classification task, we proposed OpenGram, a framework that extends a CNN classifier to tackle bacteria subtyping from an open-set perspective. Open-set recognition models can both classify known instances and detect unknown samples of novel classes. OpenGram combines a CNN classifier with a Gaussian mixture model to adapt to open-set classification. The results demonstrate OpenGram's ability to correctly detect unknown bacteria classes that were unseen by the network during training, as well as to classify known bacteria classes.
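
The general open-set idea can be sketched as follows. This is not OpenGram itself: it is a simplified stand-in that fits a single diagonal Gaussian per known class (rather than a full Gaussian mixture model) to toy 2-D vectors that play the role of CNN features, and rejects a sample as unknown when no class explains it well. The threshold value, the helper names and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def fit_diag_gaussian(features):
    """Estimate a per-dimension mean and variance from feature vectors."""
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    # A small constant keeps the variance strictly positive.
    var = [sum((f[d] - mean[d]) ** 2 for f in features) / n + 1e-6 for d in range(dim)]
    return mean, var


def log_likelihood(x, mean, var):
    """Log-density of x under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def open_set_predict(x, class_models, threshold):
    """Return the best-scoring known class, or "unknown" if its score is too low."""
    scores = {name: log_likelihood(x, m, v) for name, (m, v) in class_models.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"


# Synthetic 2-D "embeddings" standing in for CNN features (assumption).
rng = random.Random(0)


def cluster(cx, cy, n=200):
    return [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)] for _ in range(n)]


train = {
    "Staphylococcus": cluster(0, 0),
    "Enterococcus": cluster(5, 0),
    "Streptococcus": cluster(0, 5),
}
class_models = {name: fit_diag_gaussian(feats) for name, feats in train.items()}

THRESHOLD = -10.0  # hypothetical cut-off; in practice tuned on held-out data

known = open_set_predict([0.1, -0.2], class_models, THRESHOLD)
novel = open_set_predict([5.0, 5.0], class_models, THRESHOLD)
```

A sample near a training cluster is assigned that cluster's label, while a sample far from every cluster falls below the likelihood threshold and is flagged as unknown rather than forced into a known class.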

Third, deep learning-based object detection methods generally assume that large amounts of annotated training data are available and that training and testing data share the same feature space. Such assumptions do not always hold in real-world applications. In the case of pathology images, collecting these annotations can be expensive and laborious. In addition, testing supervised models on data from a different distribution can degrade detector performance, as these models may not generalise well to other domains. We aim to tackle the lack of instance-level cell labels in Gram stain WSIs for the epithelial and leukocyte cell counting task. We presented HybridGram, a framework with image translation and pseudo-labelling modules that completely avoids manual labelling on a new dataset. The results show that HybridGram can bridge the performance gap between fully supervised and unsupervised models.
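
The abstract does not describe how HybridGram selects its pseudo-labels. One common generic approach, shown here purely as an assumption, is to keep only the detector's high-confidence predictions on unlabelled images and reuse them as training labels. The `Detection` type, the `min_score` value and the example predictions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels
    label: str    # predicted cell type
    score: float  # detector confidence in [0, 1]


def select_pseudo_labels(detections, min_score=0.8):
    """Keep only predictions confident enough to reuse as training labels."""
    return [d for d in detections if d.score >= min_score]


# Made-up detector outputs on one unlabelled target-domain image.
predictions = [
    Detection((10, 12, 60, 70), "epithelial", 0.93),
    Detection((80, 15, 110, 48), "leukocyte", 0.55),
    Detection((130, 90, 165, 120), "leukocyte", 0.88),
]

pseudo_labels = select_pseudo_labels(predictions)

# Per-class counts of the retained pseudo-labels.
counts = {}
for det in pseudo_labels:
    counts[det.label] = counts.get(det.label, 0) + 1
```

The retained boxes can then be used to retrain the detector on the new dataset without manual annotation, at the cost of discarding uncertain predictions.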

About Data Science Seminar

This seminar series is hosted by EECS Data Science.

Venue

Zoom link: https://uqz.zoom.us/j/86483896908