Addressing Bias and Subjectivity in Machine Learning
Abstract: The success of supervised machine learning algorithms rests on the assumption that data are drawn
from the same underlying distribution. However, this assumption is often violated in
real-world applications where the collected data involve human judgement. The contribution
of this thesis is a collection of approaches that address bias and subjectivity in
real-world data. We illustrate our work through three applications: predicting disease
progression in Multiple Sclerosis (MS) patients, detecting epileptogenic lesions in
focal cortical dysplasia (FCD) patients, and selecting the best-performing students in
the graduate admission process. In each of these applications, subjectivity and/or bias
manifest themselves in different ways. We present a total of four models, each of which
addresses the challenges specific to its task. In the MS research, we introduce
two models to estimate the prognosis for MS patients while addressing the patient bias
and physician subjectivity in the data: a classification model that predicts the MS
disease progression ('high' versus 'low'), and a regression model that forecasts the
actual MS severity scores. In the epilepsy research, we present a model that addresses
the paucity of features from MRI images and the biases in the data that originate from
inter-patient variability. Lastly, in the third application, we introduce a new variant
of the support vector machine (SVM) that exploits both labeled and unlabeled data and addresses the subjectivity
arising from the admission process.
Thesis (Ph.D.)--Tufts University, 2017.
Submitted to the Dept. of Computer Science.
Advisor: Carla Brodley.
Committee: Roni Khardon, Benjamin Hescott, Jennifer Dy, and Shuchin Aeron.
Keywords: Computer science and Health care management.