PUBLICATION

Harnessing Omics Big Data in Nine Vertebrate Species by Genome-Wide Prioritization of Sequence Variants with the Highest Predicted Deleterious Effect on Protein Function

Authors
Rozman, V., Kunej, T.
ID
ZDB-PUB-180511-8
Date
2018
Source
Omics : a journal of integrative biology   22(6): 410-421 (Journal)
Registered Authors
Keywords
PolyPhen (polymorphism phenotyping), SIFT (sorts intolerant from tolerant), big data, orthologous potentially deleterious variants (ortho-pDelVars), potentially deleterious variants (pDelVars), single-nucleotide polymorphism (SNP)
MeSH Terms
  • Animals
  • Cattle
  • Computational Biology/methods*
  • Horses
  • Mice
  • Rats
  • Sheep
  • Swine
  • Zebrafish
PubMed
29746212 Full text @ OMICS
Abstract
Harnessing the genomics big data requires innovation in how we extract and interpret biologically relevant variants. Currently, there is no established catalog of prioritized missense variants associated with deleterious protein function phenotypes. We report in this study, to the best of our knowledge, the first genome-wide prioritization of sequence variants with the most deleterious effect on protein function (potentially deleterious variants [pDelVars]) in nine vertebrate species: human, cattle, horse, sheep, pig, dog, rat, mouse, and zebrafish. The analysis was conducted using the Ensembl/BioMart tool. Genes comprising pDelVars in the highest number of examined species were identified using a Python script. Multiple genomic alignments of the selected genes were built to identify interspecies orthologous potentially deleterious variants, which we defined as the "ortho-pDelVars." Genome-wide prioritization revealed that in humans, 0.12% of the known variants are predicted to be deleterious. In seven out of nine examined vertebrate species, the genes encoding the multiple PDZ domain crumbs cell polarity complex component (MPDZ) and the transforming acidic coiled-coil containing protein 2 (TACC2) comprise pDelVars. Five interspecies ortho-pDelVars were identified in three genes. These findings offer new ways to harness genomics big data by facilitating the identification of functional polymorphisms in humans and animal models and thus provide a future basis for optimization of protocols for whole genome prioritization of pDelVars and screening of orthologous sequence variants. The approach presented here can inform various postgenomic applications such as personalized medicine and multiomics study of health interventions (iatromics).
Genes / Markers
Figures
Expression
Phenotype
Mutations / Transgenics
Human Disease / Model
Sequence Targeting Reagents
Fish
Antibodies
Orthology
Engineered Foreign Genes
Mapping