PUBLICATION

An in silico mining for simple sequence repeats from expressed sequence tags of zebrafish, medaka, Fundulus, and Xiphophorus

Authors
Ju, Z., Wells, M.C., Martinez, A., Hazlewood, L., and Walter, R.B.
ID
ZDB-PUB-051107-37
Date
2005
Source
In Silico Biology 5(5-6): 439-463 (Journal)
Keywords
model fish, comparative genomics, in silico analysis, PTTGIP, EST, microsatellites, SSR
MeSH Terms
  • Amino Acid Sequence
  • Animals
  • Base Sequence
  • Cluster Analysis
  • Cyprinodontiformes/genetics*
  • DNA/genetics
  • Databases, Nucleic Acid
  • Expressed Sequence Tags
  • Fundulidae/genetics*
  • Genomics
  • Microsatellite Repeats*
  • Molecular Sequence Data
  • Oryzias/genetics*
  • Sequence Alignment
  • Species Specificity
  • Zebrafish/genetics*
PubMed
16268789
Abstract
Teleost fish genome projects involving model species are resulting in a rapid accumulation of genomic and expressed DNA sequences in public databases. The expressed sequence tags (ESTs) collected in these databases can be mined for both structural and functional genomic analysis. In this study, we analyzed in silico 49,430 unigenes representing a total of 692,654 ESTs from four model fish for their potential use in developing simple sequence repeats (SSRs), or microsatellites. Bioinformatic mining identified a total of 3,018 EST-derived SSRs (EST-SSRs) in 2,335 SSR-containing ESTs (SSR-ESTs). The frequency of identified SSR-ESTs ranged from 1.5% for Xiphophorus to 7.3% for zebrafish. The dinucleotide repeat motif is the most abundant SSR, accounting for 47%, 52%, 64%, and 78% of SSRs in medaka, Fundulus, zebrafish, and Xiphophorus, respectively. Simulation analysis suggests that a majority of these EST-SSRs have sufficient flanking sequence for polymerase chain reaction (PCR) primer design. Comparative DNA sequence analyses of SSR-ESTs identified several cross-species SSRs and sequences that may be used as cross-reference genes in comparative studies. For example, the flanking sequences of one SSR, (CTG)n, within the pituitary tumor-transforming gene (PTTG) 1 interacting protein (PTTGIP) showed conservation spanning the medaka, Fundulus, human, and mouse genomes. This study provides a large body of information on EST-SSRs that can be useful for the development of polymorphic markers, gene mapping, and comparative genome analysis. Functional analysis of these SSR-ESTs may reveal their role in metabolism and gene evolution of these model species.
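The abstract does not name the mining software or the repeat thresholds used, so the following is only a minimal sketch of how SSR detection in an EST sequence might look. The function name `find_ssrs`, the minimum repeat counts in `MIN_REPEATS`, and the `min_flank` parameter are assumptions made for illustration, not values from the study. The flank check is a simple stand-in for asking whether enough sequence remains on each side of a repeat for PCR primer design.

```python
import re

# Hypothetical minimum repeat counts per motif length (di-, tri-, tetranucleotide).
# These thresholds are illustrative assumptions, not values from the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS, min_flank=0):
    """Return perfect SSRs in seq as a list of dicts.

    min_flank is an assumed proxy for primer design feasibility: a repeat is
    kept only if at least min_flank bases remain on both sides of it.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each repeat is reported only under its shortest motif.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            start, end = match.start(), match.end()
            left_flank, right_flank = start, len(seq) - end
            if left_flank >= min_flank and right_flank >= min_flank:
                hits.append(
                    {
                        "motif": motif,
                        "repeats": (end - start) // unit_len,
                        "start": start,
                        "end": end,
                        "left_flank": left_flank,
                        "right_flank": right_flank,
                    }
                )
    return hits


if __name__ == "__main__":
    # Made-up example sequence: a (CTG)6 repeat with 16 bases on each side.
    example = "GATTACAGATTACAGA" + "CTG" * 6 + "TTGACCATGACCAAGT"
    for hit in find_ssrs(example, min_flank=10):
        print(hit)
```

Applied across a unigene set, the fraction of sequences with at least one hit would give an SSR-EST frequency comparable in spirit to the per-species figures quoted above, although the exact numbers depend heavily on the thresholds chosen.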