The Zebrafish Database Project
Specific Aim 1. A Multi-media Database. We
envision several kinds of data that will need to be included in these
databases:
- Image data: neuronal morphology, neuroanatomy, antibody staining
patterns, gene expression patterns, anatomical and developmental atlases.
- Text data: lists of zebrafish wild-type and mutant strains,
laboratory methods, information on DNA libraries and antibodies, gene and protein
sequences, addresses of researchers, a complete list of zebrafish research
publications, newsletter, general news.
- Graphical/spatial data: zebrafish genetic map, neuroanatomical and
morphological summary diagrams, physiological records.
We already have examples of all three data types on-line. For example,
photos illustrating the zebrafish developmental staging series are
available as images on our existing WWW server [], as
are various text files of methods, descriptions of mutants, addresses, etc.
Similarly, information about genetic linkage and the genetic map is available
as graphical/spatial data on our ACeDB server (contact ernest@lenti.umn.edu).
Limitations of present systems. A major limitation of our current
servers is that they can be accessed only by browsing, e.g. looking through a
list of information like the names of cloned genes. Ultimately we will want to be able to make much more
meaningful queries, like: What are all the genes known to be expressed by
neurons in the nucleus of the posterior commissure at 24 hours of development,
and what are the mutations known to perturb the expression of these genes?
Such queries, which search across data in essentially unlimited combinations,
require the full power of a relational database. Presently, such queries can be
done only by consulting several separate data lists at different sites, and
even if all the sites are known and consulted, it is unlikely that all the
relevant data could be found because the mechanisms for searching these data
lists are so primitive.
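To make the kind of query described above concrete, the following is a minimal sketch of how it might be posed against a relational store, here using Python's built-in sqlite3 module. The schema (tables gene, expression, mutation), the function names, and all rows other than the msxB expression record are hypothetical placeholders invented for illustration; they do not describe the database design proposed here.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical three-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE gene (
            gene_id INTEGER PRIMARY KEY,
            name    TEXT UNIQUE
        );
        CREATE TABLE expression (
            gene_id     INTEGER REFERENCES gene(gene_id),
            region      TEXT,
            stage_hours INTEGER
        );
        CREATE TABLE mutation (
            mutation_id      INTEGER PRIMARY KEY,
            name             TEXT,
            perturbs_gene_id INTEGER REFERENCES gene(gene_id)
        );
    """)
    # Placeholder rows for illustration only; not real experimental data.
    conn.executemany("INSERT INTO gene VALUES (?, ?)", [
        (1, "msxB"),
        (2, "placeholder_gene_2"),
        (3, "placeholder_gene_3"),
    ])
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", [
        (1, "nucleus of the posterior commissure", 24),
        (2, "nucleus of the posterior commissure", 24),
        (3, "placeholder_region", 24),
    ])
    conn.executemany("INSERT INTO mutation VALUES (?, ?, ?)", [
        (1, "placeholder_deletion_1", 1),
    ])
    return conn


def genes_and_mutations(conn, region, stage_hours):
    """Return (gene, mutation) pairs for genes expressed in a region at a stage.

    The LEFT JOIN keeps genes that have no known mutation; for those rows
    the mutation column is None.
    """
    return conn.execute(
        """
        SELECT g.name, m.name
        FROM expression AS e
        JOIN gene AS g ON g.gene_id = e.gene_id
        LEFT JOIN mutation AS m ON m.perturbs_gene_id = g.gene_id
        WHERE e.region = ? AND e.stage_hours = ?
        ORDER BY g.name, m.name
        """,
        (region, stage_hours),
    ).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for gene, mutation in genes_and_mutations(
        db, "nucleus of the posterior commissure", 24
    ):
        print(gene, mutation)
```

A single join answers both halves of the question (which genes, and which mutations perturb them), which is what browsing separate lists cannot do.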
Fig 1. Relationship between neurons and gene expression patterns in
a 1-day zebrafish embryo. Left, image of neurons, including identified neurons of
the nucleus of the posterior commissure (oval), stained with HNK-1 antibody
[Wilson90]. Right, expression pattern of the msxB homeobox gene in the
same nucleus [Akimenko95].
Example of how the proposed system will work. We propose to develop a
database that will facilitate complex interdependent searches. For example, to
learn whether genes guide axons, we might want to know which developmental
regulatory genes particular neurons express as they differentiate. Thus,
descriptors of these neurons in one anatomical record (Fig 1, left) would need
to be shared with the corresponding brain region in another record which shows
patterns of gene expression, part of a completely different data set (Fig 1,
right). We might then want to know more about the genes that these neurons
express. From the expression pattern (Fig 1, right) we would want to find the
gene sequence (Fig 2, upper right), information that could lead into public
databases like GenBank. Knowing the location of the gene on the genetic map
could lead to identification of mutations affecting the gene (Fig 2).
Fig 2. Relationships among msxB gene sequence (upper right,
Akimenko95), location on the genetic map of chromosome I (left,
Postlethwait94), and available mutant strains (lower right).
Descriptors of the mutants should identify the database containing images of
the brain region of interest as it appears in the mutants (Fig 3).
Fig 3. Phenotype of an msxB deletion mutation. GAP-43 expression in the head of mutant (right) and wild-type sibling (left) embryos at 1 day of development.
In practice,
all these databases should be linked in a manner allowing information to flow
in any direction. For instance, the information flow outlined here could just as well proceed in the opposite direction, starting with a newly identified mutant
phenotype, mapping it to a chromosomal location, identifying the corresponding gene, and retrieving the
wild-type expression pattern of this gene and information about neurons in the
affected region. Descriptors should also be shared with related data in human,
mouse, and invertebrate databases.
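The reverse traversal described above (mutant, then map location, then gene, then expression pattern and neurons) can be sketched in the same way. This is only an illustration under stated assumptions: the tables locus, gene, mutation, expression, and neuron, the idea of matching a mutation to a gene through a shared locus_id, and every row except the msxB expression record and its chromosome assignment are hypothetical, and a real mapping step would be considerably more involved.

```python
import sqlite3


def build_linked_db():
    """Create an in-memory database whose records share hypothetical keys."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE locus (
            locus_id   INTEGER PRIMARY KEY,
            chromosome TEXT
        );
        CREATE TABLE gene (
            name     TEXT PRIMARY KEY,
            locus_id INTEGER REFERENCES locus(locus_id)
        );
        CREATE TABLE mutation (
            name     TEXT PRIMARY KEY,
            locus_id INTEGER REFERENCES locus(locus_id)
        );
        CREATE TABLE expression (
            gene        TEXT REFERENCES gene(name),
            region      TEXT,
            stage_hours INTEGER
        );
        CREATE TABLE neuron (
            name   TEXT PRIMARY KEY,
            region TEXT
        );
    """)
    # Placeholder rows for illustration only; not real experimental data.
    conn.execute("INSERT INTO locus VALUES (1, 'I')")
    conn.execute("INSERT INTO gene VALUES ('msxB', 1)")
    conn.execute("INSERT INTO mutation VALUES ('placeholder_deletion_1', 1)")
    conn.execute(
        "INSERT INTO expression VALUES "
        "('msxB', 'nucleus of the posterior commissure', 24)"
    )
    conn.execute(
        "INSERT INTO neuron VALUES "
        "('placeholder_neuron_1', 'nucleus of the posterior commissure')"
    )
    return conn


def trace_from_mutation(conn, mutation_name):
    """Follow links mutation -> locus -> gene -> expression -> neurons.

    Returns (gene, region, stage_hours, neuron) tuples; neuron is None
    when no neuron record shares the expression region.
    """
    return conn.execute(
        """
        SELECT g.name, e.region, e.stage_hours, n.name
        FROM mutation AS m
        JOIN gene AS g ON g.locus_id = m.locus_id
        JOIN expression AS e ON e.gene = g.name
        LEFT JOIN neuron AS n ON n.region = e.region
        WHERE m.name = ?
        ORDER BY g.name, n.name
        """,
        (mutation_name,),
    ).fetchall()


if __name__ == "__main__":
    db = build_linked_db()
    for row in trace_from_mutation(db, "placeholder_deletion_1"):
        print(row)
```

Because the links are stored as shared keys rather than as one-way pointers, the same tables support the forward direction (neuron to gene to mutant) simply by changing the starting table and the WHERE clause.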