ZMAP, AN INTEGRATED MAP OF THE ZEBRAFISH GENOME
by Allen Day, Tom Conlin, and John H. Postlethwait
Institute of Neuroscience, University of Oregon
PURPOSE
For the molecular
cloning of genes originally identified by mutations, it is
important to know all mapped markers residing near a mapped
mutation. Mapped coding sequences provide information
on potential candidate loci and conserved chromosome
segments, and mapped anonymous markers such as
microsatellites provide highly polymorphic markers for fine
structure mapping.
To facilitate
these strategies, we have produced a consolidated map
containing all of the coding sequences currently positioned
on the two large radiation hybrid (RH) mapping panels (the
T51 panel, Kwok et al., 1998; Geisler et al., 1999; and the
LN54 panel, Hukriede et al., 1999; Chevrette et al., 2000)
and the doubled haploid (heat shock, HS) meiotic mapping
panel (Kelly et al., 2000; Postlethwait et al., 2000; Woods
et al., 2000). We intercalated these sequences into the
complete diploid meiotic map of microsatellites (MGH panel,
Knapik et al., 1996; Shimoda et al., 1999). The MGH
microsatellite map was used as the standard because each of
the other mapping projects first constructed a framework map
of microsatellites and then inserted the loci for coding
sequences into it.
METHODS
The following
strategy was used to intercalate the positions of coding
sequences into the MGH map. We assumed at the outset
that, to a first approximation, the distances between
closely linked markers on any given map are linearly related
to the distances between the same markers on the MGH
microsatellite map. For closely linked markers, the errors
introduced by this assumption are likely to be small.
Data for the
composite map were imported from data for the four panels
stored in ZFIN. For each locus on each map, two
intervals were determined: the interval from the coding
sequence to its nearest common microsatellite marker, and the
interval from the coding sequence to its second-nearest
microsatellite marker, with the condition that the two
microsatellite markers were not closer to one another than
the nearest microsatellite was to the coding sequence.
This reduced magnification of error from the reported
microsatellite marker positions. In cases where two
qualifying microsatellite markers did not exist, we did not
assign a position. The ratio of the two intervals was
then used to intercalate the coding sequence into the MGH
microsatellite map.
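The intercalation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to build ZMAP: the function name, the dictionary layout, and the marker names are hypothetical, and we assume that "using the ratio of the two intervals" amounts to linear interpolation (or extrapolation) between the two anchoring microsatellites. We also assume that when the second-nearest shared microsatellite does not qualify, the next-nearest one is tried.

```python
from typing import Dict, Optional


def intercalate(gene_pos: float,
                source_markers: Dict[str, float],
                mgh_markers: Dict[str, float]) -> Optional[float]:
    """Estimate an MGH position for a coding sequence mapped on another panel.

    gene_pos       -- position of the coding sequence on the source map
    source_markers -- microsatellite name -> position on the source map
    mgh_markers    -- microsatellite name -> position on the MGH map
    Returns None when two qualifying microsatellites do not exist.
    """
    # Only microsatellites placed on both maps can serve as anchors.
    common = [m for m in source_markers if m in mgh_markers]
    if not common:
        return None
    # Order the anchors by distance from the coding sequence on the source map.
    common.sort(key=lambda m: abs(source_markers[m] - gene_pos))
    nearest = common[0]
    d_nearest = abs(source_markers[nearest] - gene_pos)

    for second in common[1:]:
        spacing = source_markers[second] - source_markers[nearest]
        # Qualifying pair: the two microsatellites are not closer to one
        # another than the nearest microsatellite is to the coding sequence.
        if spacing != 0 and abs(spacing) >= d_nearest:
            # Signed fraction of the anchor interval (assumed linear relation).
            fraction = (gene_pos - source_markers[nearest]) / spacing
            mgh_span = mgh_markers[second] - mgh_markers[nearest]
            return mgh_markers[nearest] + fraction * mgh_span
    return None


# Hypothetical example: ssr_c is not on the MGH map, so it is ignored.
source = {"ssr_a": 10.0, "ssr_b": 20.0, "ssr_c": 12.0}
mgh = {"ssr_a": 30.0, "ssr_b": 50.0}
print(intercalate(12.5, source, mgh))  # 35.0
```

Because the fraction is signed, the same expression covers a coding sequence lying between the two microsatellites and one lying outside them on the source map.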
USING THE MAP
This integrated
map is only as good as the original data sets from which the
data were taken, with additional uncertainty introduced by
the assumptions necessary to perform the intercalation
procedure. No independent attempt was made to verify
the locations of markers, and so there are cases in which the
same marker, or different markers from the same unigene
cluster, appear in multiple locations. In these
conflicting cases, we display all locations and note them
with asterisks following the marker names.
In cases of
conflicting map data, there is no reliable way, a priori, to
decide which of several positions may be in error.
Sources of potential error include mosaic EST clones, errors
in labeling samples or gel lanes, errors in the scoring of
mapping gels, errors in the recording of data, and violation
of the assumption of co-linear maps. By examining all
data in a common framework, however, it may be possible to
determine which of several positions is in error. For
coding sequences with reported positions that cluster, a
simple measure of variance will reveal which loci are most
likely to be erroneous.
In the current ZMAP, all locations of loci were
included. For the purposes of providing candidate
markers for mutations, it seems more prudent to err on the
side of the inclusion of all data. The cost of a marker
appearing in an interval in which it does not belong is that
a few false positives will be linked closely to one's
mutant. Simple experiments can quickly reveal loci that
do not belong in a given interval, and we hope that you will
contact us when you discover such a "mistake" in the
map. On the other hand, if we inappropriately exclude
data that are correctly positioned, we will generate false
negatives, and researchers will miss potential
candidates. This type of error is much more difficult
to reveal.
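As a rough illustration of the variance measure suggested above for markers with several reported positions, the following Python sketch ranks markers by the spread of their integrated positions. It is not part of ZMAP: the function name, the input layout, and the marker names are hypothetical, and we assume that all positions for a marker lie on the same linkage group and are expressed on the MGH scale.

```python
from statistics import pvariance
from typing import Dict, List, Tuple


def rank_by_spread(positions: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Return (marker, variance) pairs, largest variance first.

    positions -- marker name -> list of integrated positions reported
                 for that marker on one linkage group (assumed input).
    Markers with a single reported position are skipped, because a
    spread cannot be computed for them.
    """
    spreads = [(name, pvariance(values))
               for name, values in positions.items()
               if len(values) >= 2]
    # Markers whose positions disagree most come first.
    return sorted(spreads, key=lambda item: item[1], reverse=True)


# Hypothetical data: gene_b has two widely separated positions.
reported = {
    "gene_a": [10.0, 11.0, 10.5],
    "gene_b": [5.0, 60.0],
    "gene_c": [20.0],
}
for name, variance in rank_by_spread(reported):
    print(name, round(variance, 2))
```

Markers at the top of the ranking are the ones whose reported positions disagree most and are therefore the first candidates for experimental checking. The ranking does not say which of the conflicting positions is wrong.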
ACKNOWLEDGEMENTS
Thanks are due to
ZFIN, Judy Sprague, Monte Westerfield, and NIH grants
R01-RR10715 to JHP, P01-HD22486 to M. Westerfield and J.
Postlethwait, and P40-RR12546 to the Zebrafish International
Resource Center.
REFERENCES
Chevrette M, Joly L, Tellis P, Knapik EW, Miles J, Fishman M,
Ekker M (2000) Characterization of a zebrafish/mouse somatic
cell hybrid panel. Genomics. 64:119-126.
Geisler, R., Rauch, G.J., Baier, H., van Bebber, F., Broß,
L., Dekens, M.P., Finger, K., Fricke, C., Gates, M.A.,
Geiger, H., Geiger-Rudolph, S., Gilmour, D., Glaser, S.,
Gnugge, L., Habeck, H., Hingst, K., Holley, S., Keenan, J.,
Kirn, A., Knaut, H., Lashkari, D., Maderspacher, F., Martyn,
U., Neuhauss, S., Haffter P, et al. 1999. A radiation hybrid
map of the zebrafish genome. Nat. Genet. 23:86-89.
Hukriede, N., L. Joly, M. Tsang, J. Miles, P. Tellis, J.
Epstein, W. Barbazuk, F. Li, B. Paw, J. Postlethwait, T.
Hudson, L. Zon, J. McPherson, M. Chevrette, I. Dawid, S.
Johnson, and M. Ekker (1999) Radiation hybrid mapping of the
zebrafish genome. Proc. Natl. Acad. Sci. USA
96:9745-9750.
Kelly, P.D., F. Chu, I.G. Woods, P. Ngo-Hazelett, T. Cardozo,
H. Huang, F. Kimm, L. Liao, Y.-L. Yan, Y. Zhou, S.L. Johnson,
R. Abagyan, A.F. Schier, J.H. Postlethwait, and W.S. Talbot
(2000) Genetic linkage mapping of zebrafish genes and ESTs.
Genome Res. 10:558-567.
Knapik, E. W., Goodman, A., Atkinson, O. S., Roberts, C. T.,
Shiozawa, M., Sim, C. U., Weksler-Zangen, S., Trolliet, M.
R., Futrell, C., Innes, B. A., Koike, G., McLaughlin, M. G.,
Pierre, L., Simon, J. S., Vilallonga, E., Roy, M., Chiang, P.
W., Fishman, M. C., Driever, W., and Jacob, H. J. (1996). A
reference cross DNA panel for zebrafish (Danio rerio)
anchored with simple sequence length polymorphisms.
Development 123:451-460.
Kwok, C., Korn, R.M., Davis, M.E., Burt, D.W., Critcher, R.,
McCarthy, L., Paw, B.H., Zon, L.I., Goodfellow, P.N.,
and Schmitt, K. 1998. Characterization of whole genome
radiation hybrid mapping resources for non-mammalian
vertebrates. Nucleic Acids Res. 26:3562-3566.
Postlethwait, J.H., Woods, I.G., Ngo-Hazelett, P., Yan,
Y.-L., Kelly, P.D., Chu, F., Huang, H., Hill Force, A., and
Talbot, W.S. (2000) Zebrafish comparative genomics and the
origins of vertebrate chromosomes. Genome Res.
10:1890-1902.
Shimoda, N., Knapik, E.W., Ziniti, J., Sim, C., Yamada, E.,
Kaplan, S., Jackson, D., deSauvage, F., Jacob, H., and
Fishman, M.C. (1999) Zebrafish genetic map with 2000
microsatellite markers. Genomics. 58:219-232.
Woods, I.G., Kelly, P.D., Chu, F., Ngo-Hazelett, P., Yan,
Y.-L., Huang, H., Postlethwait, J.H., and Talbot, W.S. (2000)
A comparative map of the zebrafish genome. Genome Res.
10:1903-1914.
Links to primary data:
T51 RH panel:
http://wwwmap.tuebingen.mpg.de/
and
http://zfrhmaps.tch.harvard.edu/ZonRHmapper/
LN54 RH panel:
http://mgchd1.nichd.nih.gov:8000/zfrh/current.html
and
http://www.genetics.wustl.edu/fish_lab/frank/cgi-bin/fish/rhmaps/rh.html
HS panel:
http://cmgm.stanford.edu/~tallab/HeatShock99/Raw_Data/
MGH microsatellite map:
http://zebrafish.mgh.harvard.edu/mapping/ssr_map_index.html