MIT Department: Computational and Systems Biology
Faculty Mentor: Prof. Ernest Fraenkel
Undergraduate Institution: University of Maryland, Baltimore County
I am studying Physics and Computer Science, and I am particularly interested in the intersection of Physics, Biology, and Computer Science. I hope to pursue a career in biomedical engineering, creating devices that better people’s lives.
Using Supervised Machine Learning Methods to Create a Gene-Based ALS Predictor from Postmortem Transcriptomics Data
Christopher Bain1, Divya Ramamoorthy2, Ernest Fraenkel2
1Department of Physics and Biomedical Engineering, University of Maryland, Baltimore County
2Department of Systems and Computational Biology, Massachusetts Institute of Technology
Amyotrophic lateral sclerosis (ALS) is a progressive, fatal paralytic disorder characterized by the degeneration of motor neurons in the brain and spinal cord. Death, typically from respiratory paralysis, usually occurs within 3 to 5 years of symptom onset. While several pathogenic mutations have been identified, the vast majority of ALS patients have no family history of the disease. Thus, in most cases, the disease may arise from multiple pathways that contribute to varying degrees in each patient. To investigate this possibility, we applied logistic regression and other supervised machine learning methods to the postmortem transcriptomic profiles of ~300 patients in order to identify clusters of patients with similar transcriptomic markers. Identifying such clusters would allow us to construct an algorithm that takes a patient’s gene expression profile as input and accurately predicts their likelihood of developing ALS in the future. Moreover, finding significant similarities among patients’ genes could provide crucial insights for future neurodegenerative disease research and improve care for ALS patients. We constructed a classifier that predicted whether a patient had ALS with 86.7% accuracy. In addition, using the coefficients of the fitted model, we found correlations between the heavily weighted genes and compensatory mechanisms that the body implements in ALS patients. We believe logistic regression is a strong classifier both for estimating the likelihood that a patient has ALS and for assigning meaningful weights to specific genes.
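The general approach described in the abstract, fitting a logistic regression classifier to expression values and then ranking genes by the magnitude of their coefficients, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the cohort is synthetic, the gene indices are hypothetical, the L2 penalty and gradient-descent settings are arbitrary, and the accuracy it produces has no relation to the 86.7% reported above.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic_regression(X, y, lr=0.1, epochs=300, l2=0.01):
    """Fit weights and bias by batch gradient descent on the L2-penalised log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one patient: predicted probability minus label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that expression profile x belongs to the ALS class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def accuracy(w, b, X, y):
    """Fraction of profiles classified correctly at a 0.5 probability threshold."""
    hits = sum((predict_proba(w, b, xi) >= 0.5) == bool(yi) for xi, yi in zip(X, y))
    return hits / len(y)


def make_synthetic_cohort(n_patients=300, n_genes=10, seed=0):
    """Synthetic stand-in for expression data (NOT real patient data).

    Hypothetical gene 0 is shifted up and hypothetical gene 3 is shifted
    down in the ALS class (label 1); all other genes are pure noise.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_patients):
        label = rng.randint(0, 1)
        sign = 1.0 if label else -1.0
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        row[0] += 1.5 * sign
        row[3] -= 1.5 * sign
        X.append(row)
        y.append(label)
    return X, y


X, y = make_synthetic_cohort()
split = 240  # first 240 profiles for training, remaining 60 held out
w, b = fit_logistic_regression(X[:split], y[:split])
test_accuracy = accuracy(w, b, X[split:], y[split:])

# Rank genes by absolute coefficient: larger magnitude means more influence.
ranked_genes = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
print("held-out accuracy:", test_accuracy)
print("genes ranked by |coefficient|:", ranked_genes)
```

In this sketch the sign of a coefficient indicates whether higher expression pushes the prediction toward or away from the ALS class, which is the property that makes the fitted weights interpretable. With real transcriptomic data, where genes far outnumber patients, expression values would normally be standardized first and accuracy estimated by cross-validation rather than a single split.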