PUBLICATION

Polymorphism, shared functions and convergent evolution of genes with sequences coding for polyalanine domains

Authors
Lavoie, H., Debeane, F., Trinh, Q.D., Turcotte, J.F., Corbeil-Girard, L.P., Dicaire, M.J., Saint-Denis, A., Page, M., Rouleau, G.A., and Brais, B.
ID
ZDB-PUB-031106-1
Date
2003
Source
Human Molecular Genetics 12(22): 2967-2979 (Journal)
Registered Authors
Keywords
none
MeSH Terms
  • Amino Acid Sequence
  • Animals
  • Caenorhabditis elegans/genetics
  • Chickens/genetics
  • Codon
  • Conserved Sequence
  • Drosophila melanogaster/genetics
  • Evolution, Molecular*
  • Genes*
  • Genome, Human
  • Homeodomain Proteins
  • Humans
  • Peptides/chemistry*
  • Phylogeny
  • Polymorphism, Genetic*
  • Protein Structure, Tertiary
  • Repetitive Sequences, Amino Acid
  • Vertebrates/genetics
  • Zebrafish/genetics
PubMed
14519685
Abstract
Mutations causing expansions of polyalanine domains are responsible for nine hereditary diseases. Other GC-rich sequences coding for some polyalanine domains were found to be polymorphic in humans. These observations prompted us to identify all sequences in the human genome coding for polyalanine stretches longer than four alanines and establish their degree of polymorphism. We identified 494 annotated human proteins containing 604 polyalanine domains. Thirty-two percent (31/98) of tested sequences coding for more than seven alanines were polymorphic. The length of the polyalanine-coding sequence and its GCG or GCC repeat content are the major predictors of polymorphism. GCG codons are over-represented in human polyalanine-coding sequences. Our data suggest that GCG and GCC codons play a key role in polyalanine-coding sequence appearance and polymorphism. The grouping by shared function of polyalanine-containing proteins in Homo sapiens, Drosophila melanogaster and Caenorhabditis elegans shows that the majority are involved in transcriptional regulation. Phylogenetic analyses of HOX, GATA and EVX protein families demonstrate that polyalanine domains arose independently in different members of these families, suggesting that convergent molecular evolution may have played a role. Finally, polyalanine domains in vertebrates are conserved between mammals and are rarer and shorter in Gallus gallus and Danio rerio. Together, our results show that the polymorphic nature of sequences coding for polyalanine domains makes them prime candidates for mutations in hereditary diseases and suggest that they have appeared in many different protein families through convergent evolution. http://hmg.oupjournals.org/cgi/content/full/12/22/2967
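The screening step described in the abstract (finding polyalanine stretches longer than four alanines and tallying the alanine codons that encode them) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the `min_len` parameter, and the toy protein and coding sequences are hypothetical, and the sketch assumes an uninterrupted alanine run and an in-frame coding sequence that starts at the first codon of the protein.

```python
import re
from collections import Counter


def find_polyalanine_runs(protein, min_len=5):
    """Return (start, length) for each uninterrupted alanine run.

    min_len=5 mirrors "longer than four alanines" in the abstract;
    the parameter itself is an assumption made for illustration.
    """
    pattern = re.compile(r"A{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(protein.upper())]


def alanine_codon_counts(cds, start, length):
    """Count the codons encoding one alanine run.

    Assumes `cds` is in frame, so residue i is encoded by cds[3*i : 3*i + 3].
    """
    cds = cds.upper()
    codons = [cds[3 * i:3 * i + 3] for i in range(start, start + length)]
    return Counter(codons)


# Hypothetical toy sequences, not data from the paper.
toy_protein = "MKAAAAAAGLAAAS"
toy_cds = "ATGAAAGCGGCGGCCGCGGCAGCTGGCCTGGCGGCGGCGAGC"

runs = find_polyalanine_runs(toy_protein)
for start, length in runs:
    counts = alanine_codon_counts(toy_cds, start, length)
    print(start, length, dict(counts))
```

In this toy input the three-alanine stretch near the C-terminus is skipped because it falls below the length threshold; only the six-residue run is reported together with its GCG, GCC, GCA and GCT counts.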