Abstract Processing Pipeline (SVM) Smartphone Application · 3. Ilker Ersoy, 4. Andrew Powell, 5....

1
Smartphone Application Stefan Jaeger, 1* Kamolrat Silamut, 2 Hang Yu, 1 Mahdieh Poostchi, 3 Ilker Ersoy, 4 Andrew Powell, 5 Zhaohui Liang, 6 Md Amir Hossain, 7 Sameer Antani, 1 Kannappan Palaniappan, 3 Richard J Maude, 2,8 George Thoma 1 Processing Pipeline (SVM) Abstract Reducing the Diagnostic Burden of Malaria Using Microscopy Image Analysis and Machine Learning in the Field Microscopy remains the main technique for diagnosing malaria, despite the availability of Rapid Diagnostic Tests. Hundreds of millions of blood films are examined using microscopy every year for diagnosing malaria and quantifying parasite burdens. Processing this large number of slides consumes scarce resources. Microscopy technicians who read these slides in the field may be inadequately trained or overwhelmed with the volume of slides to process, leading to missed and incorrect diagnoses. To ease the burden for microscopists and improve diagnostic and quantitative accuracy, we have developed a smartphone application that can assist field microscopists in diagnosis of malaria. The software runs on a standard Android smartphone that is attached to a microscope by a low cost adapter. Images of thin-film microscope slides are acquired through the eyepiece of the microscope using the smartphone’s built-in camera. The smartphone application assists microscopist in detecting parasites and estimating the parasitaemia. For each microscope field, the image processing software identifies infected and uninfected cells, and reports the parasite count per microliter of blood. The software was trained with more than 200,000 red blood cells from slides acquired at Chittagong Medical College Hospital in Bangladesh from patients with and without P. falciparum infection. These were manually annotated by an experienced professional slide reader. This is one of the largest labeled malaria slide image collections, enabling the application of new machine learning techniques such as deep learning. For each field-of-view image taken, an image processing pipeline is applied first to detect and segment cells before computing color and texture features for automatic machine classification to discriminate between infected and uninfected cells and other objects in the slide. Initial experiments show that our software correlates highly with human experts and flow cytometry. Acknowledgments 1 Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine, National Institutes of Health, USA 2 Mahidol-Oxford Tropical Medicine Research Unit, Mahidol University, Bangkok, Thailand 3 Computer Science Department, University of Missouri, Columbia, MO 65211, USA 4 School of Medicine, University of Missouri, Columbia, MO 65212, USA 5 Computer Science Department, Swarthmore College, Swarthmore, PA 19081, USA 6 School of Information Technology, York University, Toronto, ON, M3J1P3, Canada 7 Chittagong Medical College Hospital, Chittagong, Bangladesh 8 Nuffield Department of Clinical Medicine, University of Oxford, Oxford OX3 7FZ, UK * Contact: [email protected] This research is supported by the HHS Ventures Fund and the Intramural Research Program of NIH, NLM, and Lister Hill National Center for Biomedical Communications. We would like to acknowledge all the patients in Bangladesh. Mahidol- Oxford Tropical Medicine Research Unit is funded by the Wellcome Trust of Great Britain. Images Preprocessing Cell Segmentation Feature Computation for Each Cell Classification with Support Vector Machine (SVM) Cell Counting Results Level-Set Cell Segmentation Training Data Firefly annotation tool (firefly.cs.missouri.edu) 150 infected patients, 50 healthy patients 2500 slide images 250,000 manually annotated red blood cells Deep Learning with CNN Measure CNN Model Transfer Learning Accuracy 97.37 91.99 Sensitivity 96.99 89.00 Specificity 97.75 94.98 Precision 97.73 95.12 F1 Score 97.36 90.24 Matthews correlation coeficient 94.75 85.25 Single Cell Classification Performance (CNN) Pipeline Performance per Slide (SVM) Ground Truth vs Automatic Infection Ratio* ( nRGB & LBP Features; avg. diff. = 0.017) Automatic Infection Ratio Convolutional Neural Network (CNN): *Infection Ratio = Number of infected red blood cells Total number of red blood cells Ground Truth Infection Ratio

Transcript of Abstract Processing Pipeline (SVM) Smartphone Application · 3. Ilker Ersoy, 4. Andrew Powell, 5....

Page 1: Abstract Processing Pipeline (SVM) Smartphone Application · 3. Ilker Ersoy, 4. Andrew Powell, 5. Zhaohui Liang, 6 . Md Amir Hossain, 7. Sameer Antani, 1. Kannappan Palaniappan, 3.

Smartphone Application

Stefan Jaeger,1* Kamolrat Silamut,2 Hang Yu,1 Mahdieh Poostchi,3 Ilker Ersoy,4 Andrew Powell,5 Zhaohui Liang,6

Md Amir Hossain,7 Sameer Antani,1 Kannappan Palaniappan,3 Richard J Maude,2,8 George Thoma1

Processing Pipeline (SVM)Abstract

Reducing the Diagnostic Burden of Malaria Using Microscopy Image Analysis and Machine Learning in the Field

Microscopy remains the main technique for diagnosingmalaria, despite the availability of Rapid DiagnosticTests. Hundreds of millions of blood films areexamined using microscopy every year for diagnosingmalaria and quantifying parasite burdens. Processingthis large number of slides consumes scarceresources. Microscopy technicians who read theseslides in the field may be inadequately trained oroverwhelmed with the volume of slides to process,leading to missed and incorrect diagnoses. To easethe burden for microscopists and improve diagnosticand quantitative accuracy, we have developed asmartphone application that can assist fieldmicroscopists in diagnosis of malaria. The softwareruns on a standard Android smartphone that isattached to a microscope by a low cost adapter.Images of thin-film microscope slides are acquiredthrough the eyepiece of the microscope using thesmartphone’s built-in camera. The smartphoneapplication assists microscopist in detecting parasitesand estimating the parasitaemia. For each microscopefield, the image processing software identifies infectedand uninfected cells, and reports the parasite countper microliter of blood. The software was trained withmore than 200,000 red blood cells from slidesacquired at Chittagong Medical College Hospital inBangladesh from patients with and without P.falciparum infection. These were manually annotatedby an experienced professional slide reader. This isone of the largest labeled malaria slide imagecollections, enabling the application of new machinelearning techniques such as deep learning. For eachfield-of-view image taken, an image processingpipeline is applied first to detect and segment cellsbefore computing color and texture features forautomatic machine classification to discriminatebetween infected and uninfected cells and otherobjects in the slide. Initial experiments show that oursoftware correlates highly with human experts and flowcytometry.

Acknowledgments

1Lister Hill National Center for BiomedicalCommunications, U.S. National Library of Medicine,National Institutes of Health, USA

2Mahidol-Oxford Tropical Medicine Research Unit,Mahidol University, Bangkok, Thailand

3Computer Science Department, University ofMissouri, Columbia, MO 65211, USA

4School of Medicine, University of Missouri, Columbia,MO 65212, USA

5Computer Science Department, Swarthmore College, Swarthmore, PA 19081, USA

6School of Information Technology, York University, Toronto, ON, M3J1P3, Canada

7Chittagong Medical College Hospital, Chittagong, Bangladesh

8Nuffield Department of Clinical Medicine, University of Oxford, Oxford OX3 7FZ, UK* Contact: [email protected]

This research is supported by the HHS Ventures Fund and theIntramural Research Program of NIH, NLM, and Lister HillNational Center for Biomedical Communications. We wouldlike to acknowledge all the patients in Bangladesh. Mahidol-Oxford Tropical Medicine Research Unit is funded by theWellcome Trust of Great Britain.

Images

Preprocessing

Cell Segmentation

Feature Computation for Each Cell

Classification withSupport Vector Machine (SVM)

Cell Counting

Results

Level-Set Cell Segmentation

Training Data

Firefly annotation tool (firefly.cs.missouri.edu) 150 infected patients, 50 healthy patients 2500 slide images 250,000 manually annotated red blood cells

Deep Learning with CNN

Measure CNN Model Transfer LearningAccuracy 97.37 91.99Sensitivity 96.99 89.00Specificity 97.75 94.98Precision 97.73 95.12F1 Score 97.36 90.24Matthews

correlation coeficient 94.75 85.25

Single Cell Classification Performance (CNN)

Pipeline Performance per Slide (SVM)Ground Truth vs Automatic Infection Ratio*( nRGB & LBP Features; avg. diff. = 0.017)

Aut

omat

ic In

fect

ion

Rat

io

Convolutional Neural Network (CNN):

*Infection Ratio =Number of infected red blood cells

Total number of red blood cells

Ground Truth Infection Ratio