Discovering molecular mechanisms of human disease through gene sets and networks
Park, Jisoo.
2017
Abstract: Molecular
processes, such as genetic mutations and interactions between gene products, play a key
role in the development of disease. Yet, there is still not enough known about molecular
causes of disease. In this thesis, we introduce methods to discover hidden connections
between biological functions and disease using gene sets and networks. We first
investigate a topological property of pathways in protein-protein interaction networks.
We define a new measurement called pathway centrality which measures the amount of
information flow between disease genes and differentially expressed genes handled by a
pathway. We find mediating pathways for three pulmonary diseases (asthma;
bronchopulmonary dysplasia (BPD); and chronic obstructive pulmonary disease (COPD))
using pathway centrality. Mediating pathways shared by all three pulmonary disorders are
mostly related to inflammation or immune responses and include specific pathways such as
cytokine production, NF-κB signaling, and JAK/STAT signaling. We confirm our
findings, which suggest new treatment approaches, both with anecdotal evidence from the
literature and via systematic evaluation using genetic interactions. Second, we identify
connections between developmental processes and disease by statistically testing
overlaps between developmental gene sets and disease gene sets. To handle missing
disease-gene association information, we pool disease genes from specific disease terms
to more general disease terms in a disease taxonomy. Our overlap analysis results for
nine developmental gene sets confirm many expected connections, such as those between
cardiovascular disorders and heart development genes. Closer investigation of our
results highlights some unexpected connections, such as ones between bone development
and dementia, heart development and polycystic ovary syndrome, and lung development and
retinopathy of prematurity. These connections have been further supported by recent
publications and again suggest novel therapeutic strategies. While successful, this work
highlights a need for more molecular disease taxonomies to improve the efficacy of gene
pooling. We therefore run a pilot study to infer disease hierarchies using only disease-gene
association information. We evaluate our inferred disease hierarchies by comparing them to
existing ones because there is no gold standard molecular disease taxonomy available. We
find that our inference algorithm is able to recover much of the structure of existing
disease taxonomies. While inference is easier for smaller sets of disease terms, we
find some large disease categories where inference methods perform well, such as
Endocrine System Diseases, Nutritional and Metabolic Diseases, and Respiratory Tract
Diseases. We suspect that the existing hierarchies representing these disease categories
incorporate larger amounts of molecular data, perhaps because they include several
well-studied complex diseases. Overall, we have introduced new computational methods
that highlight novel connections between gene sets and diseases. We expect that our
studies will lead to deeper understanding of underlying mechanisms of human disease, and
ultimately to better support of molecular
medicine.
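Illustrative sketch (not taken from the thesis): the abstract describes pooling disease genes from specific terms up to more general terms in a disease taxonomy, then statistically testing the overlap with developmental gene sets. The Python below shows one plausible way to do this. The abstract does not name the statistical test, so the one-sided hypergeometric test used here is an assumption. The function names, the toy taxonomy, the placeholder gene IDs (G1, G2, ...), and the universe size of 20 genes are all hypothetical.

```python
from math import comb


def pool_genes(children, direct_genes, term):
    """Return genes annotated to `term` or to any of its descendants.

    `children` maps a term to its child terms; `direct_genes` maps a term
    to the genes annotated directly to it. Assumes the taxonomy has no cycles.
    """
    pooled = set(direct_genes.get(term, ()))
    for child in children.get(term, ()):
        pooled |= pool_genes(children, direct_genes, child)
    return pooled


def overlap_p_value(set_a, set_b, universe_size):
    """One-sided hypergeometric p-value: P(overlap >= observed overlap).

    Treats `set_b` as a random draw of len(set_b) genes from a universe of
    `universe_size` genes, of which len(set_a) belong to `set_a`.
    """
    observed = len(set_a & set_b)
    n_a, n_b = len(set_a), len(set_b)
    total = comb(universe_size, n_b)
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(observed, min(n_a, n_b) + 1)
    )
    return tail / total


# Toy taxonomy with placeholder gene IDs (hypothetical data, not real annotations).
children = {"cardiovascular disease": ["cardiomyopathy", "arrhythmia"]}
direct_genes = {
    "cardiovascular disease": {"G4"},
    "cardiomyopathy": {"G1", "G2"},
    "arrhythmia": {"G2", "G3"},
}
heart_development = {"G1", "G2", "G3", "G9"}

pooled = pool_genes(children, direct_genes, "cardiovascular disease")
p_value = overlap_p_value(pooled, heart_development, universe_size=20)
print(sorted(pooled), p_value)
```

In a real analysis, the p-values for all tested disease terms and gene sets would also need a multiple-testing correction, which this sketch leaves out.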
Thesis (Ph.D.)--Tufts University, 2017.
Submitted to the Dept. of Computer Science.
Advisor: Donna Slonim.
Committee: Lenore Cowen, Benjamin Hescott, Kyongbum Lee, and Teresa Przytycka.
Keywords: Computer science, and Bioinformatics.
- ID:
- qj72pj865
- Component ID:
- tufts:22442