Description |
-
Abstract: Molecular processes, such as genetic mutations and interactions between gene products, play a key role in the development of disease. Yet, there is still not enough known about molecular causes of disease. In this thesis, we introduce methods to discover hidden connections between biological functions and disease using gene sets and networks. We first investigate a topological property of pathways in protein-protein interaction networks. We define a new measurement called pathway centrality which measures the amount of information flow between disease genes and differentially expressed genes handled by a pathway. We find mediating pathways for three pulmonary diseases (asthma; bronchopulmonary dysplasia (BPD); and chronic obstructive pulmonary disease (COPD)) using pathway centrality. Mediating pathways shared by all three pulmonary disorders are mostly related to inflammation or immune responses and include specific pathways such as cytokine production, NF Kappa B signaling, and JAK/STAT signaling. We confirm our findings, which suggest new treatment approaches, both with anecdotal evidence from the literature and via systematic evaluation using genetic interactions. Second, we identify connections between developmental processes and disease by statistically testing overlaps between developmental gene sets and disease gene sets. To handle missing disease-gene association information, we pool disease genes from specific disease terms to more general disease terms in a disease taxonomy. Our overlap analysis results for nine developmental gene sets confirm many expected connections, such as those between cardiovascular disorders and heart development genes. Closer investigation of our results highlights some unexpected connections, such as ones between bone development and dementia, heart development and polycystic ovary syndrome, and lung development and retinopathy of prematurity.
These connections have been further supported by recent publications and again suggest novel therapeutic strategies. While successful, this work highlights a need for more molecular disease taxonomies to improve the efficacy of gene pooling. We ran a pilot study to infer disease hierarchies using only disease-gene association information. We evaluate our inferred disease hierarchies by comparing them to existing ones because there is no gold standard molecular disease taxonomy available. We find that our inference algorithm is able to recover much of the structure of existing disease taxonomies. While inference is easier for smaller sets of disease terms, we find some large disease categories where inference methods perform well, such as Endocrine System Diseases, Nutritional and Metabolic Diseases, and Respiratory Tract Diseases. We suspect that the existing hierarchies representing these disease categories incorporate larger amounts of molecular data, perhaps because they include several well-studied complex diseases. Overall, we have introduced new computational methods that highlight novel connections between gene sets and diseases. We expect that our studies will lead to a deeper understanding of the underlying mechanisms of human disease, and ultimately to better support of molecular medicine.
Thesis (Ph.D.)--Tufts University, 2017.
Submitted to the Dept. of Computer Science.
Advisor: Donna Slonim.
Committee: Lenore Cowen, Benjamin Hescott, Kyongbum Lee, and Teresa Przytycka.
Keywords: Computer science and Bioinformatics.
|