Prioritizing causal disease genes using unbiased genomic features
- Deo, R.C., Musso, G., Tasan, M., Tang, P., Poon, A., Yuan, C., Felix, J.F., Vasan, R.S., Beroukhim, R., De Marco, T., Kwok, P.Y., MacRae, C.A., Roth, F.P.
- Genome Biology 15: 534 (Journal)
- Registered Authors: MacRae, Calum A.
- MeSH Terms:
  - Artificial Intelligence*
  - Cardiovascular Diseases/genetics
  - Cardiovascular Diseases/pathology
  - Disease Models, Animal
  - Genetic Predisposition to Disease
  - Genome-Wide Association Study
  - Hypertrophy, Left Ventricular/genetics*
  - Hypertrophy, Left Ventricular/pathology*
  - Molecular Sequence Data
- PubMed ID: 25633252
Deo, R.C., Musso, G., Tasan, M., Tang, P., Poon, A., Yuan, C., Felix, J.F., Vasan, R.S., Beroukhim, R., De Marco, T., Kwok, P.Y., MacRae, C.A., Roth, F.P. (2014) Prioritizing causal disease genes using unbiased genomic features. Genome Biology. 15:534.
Background: Cardiovascular disease (CVD) is the leading cause of death in the developed world. Human genetic studies, including genome-wide sequencing and SNP-array approaches, promise to reveal disease genes and mechanisms representing new therapeutic targets. In practice, however, identification of the actual genes contributing to disease pathogenesis has lagged behind identification of associated loci, thus limiting the clinical benefits.
Results: To aid in localizing causal genes, we develop a machine learning approach, Objective Prioritization for Enhanced Novelty (OPEN), which quantitatively prioritizes gene-disease associations based on a diverse group of genomic features. This approach uses only unbiased predictive features and thus is not hampered by a preference towards previously well-characterized genes. We demonstrate success in identifying genetic determinants for CVD-related traits, including cholesterol levels, blood pressure, and conduction system and cardiomyopathy phenotypes. Using OPEN, we prioritize genes, including FLNC, for association with increased left ventricular diameter, which is a defining feature of a prevalent cardiovascular disorder, dilated cardiomyopathy or DCM. Using a zebrafish model, we experimentally validate FLNC and identify a novel FLNC splice-site mutation in a patient with severe DCM.
Conclusion: Our approach stands to assist interpretation of large-scale genetic studies without compromising their fundamentally unbiased nature.
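The general idea of feature-based gene prioritization described in the abstract can be illustrated with a toy sketch. This is not the OPEN implementation: the abstract does not specify the model or the features, so the logistic-regression classifier, the two numeric features, the gene labels (GENE_A, GENE_B, GENE_C), and all values below are hypothetical and chosen only for illustration.

```python
import math

def predict_proba(w, x):
    """Probability that a gene is disease-associated; w[0] is the bias term."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic-regression weights by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def rank_candidates(w, candidates):
    """Return (gene, score) pairs sorted from highest to lowest score."""
    scored = [(gene, predict_proba(w, feats)) for gene, feats in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy training data (hypothetical): each row holds two made-up genomic
# feature values for one gene; label 1 = known disease gene, 0 = background.
train_X = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9],
           [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
train_y = [1, 1, 1, 0, 0, 0]

weights = train_logistic(train_X, train_y)

# Hypothetical candidate genes at one associated locus.
locus = {"GENE_A": [0.85, 0.7], "GENE_B": [0.15, 0.2], "GENE_C": [0.5, 0.4]}
ranking = rank_candidates(weights, locus)

for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

In this sketch the features contain no information about how well studied a gene is, which mirrors the abstract's point that only unbiased features are used; any real application would need the feature set and model described in the paper itself.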