The rapid progress of the Human Genome Project has greatly benefited from two classes of short sequence tags, genomic (STS) and transcribed (EST), which are held in two separate databases. STSs are usually random genomic sequences derived solely for mapping purposes, whereas ESTs represent transcribed sequences that must be mapped one by one. Here, we propose a way of establishing links between these two sets of sequences, allowing EST sequences to be mapped automatically by simple comparison with relatively nonrandom STSs. We suggest that EagI-based STSs derived from selected genomic regions organized in YAC contigs can automatically and finely map a significant portion of the ESTs, partially bridging the gap between the two sets of sequences and saving a great deal of time in mapping efforts. To test this principle, we selected 330 high-quality STSs derived from the Xq24-qter region and used them for transcript searches by comparing them with the EST database as well as the nonredundant database. This search detected 4 known genes and 2 additional EST clones. In contrast, when the same databases were searched with a set of 53 sequences derived from around EagI sites in the same chromosomal region, 7 known genes and 6 additional ESTs were found. These findings, together with data obtained from simulation analysis of long sequences in the same chromosomal region, suggest that EagI-based STSs can partially bridge the gap between STSs and ESTs.
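As an illustration of the idea of deriving sequence tags around EagI sites, the following is a minimal sketch of how such tags might be extracted in silico from a long genomic sequence. It is not the procedure used in this work. The function name `eagi_flanking_tags` and the `flank` parameter are hypothetical, and the only outside fact assumed is the EagI recognition sequence, 5'-CGGCCG-3'. The subsequent comparison of each tag against the EST and nonredundant databases would be done with a separate similarity search tool and is not shown.

```python
import re

# EagI recognition sequence. It is palindromic, so scanning one strand
# finds the sites on both strands.
EAGI_SITE = "CGGCCG"


def eagi_flanking_tags(sequence, flank=200):
    """Return (position, tag) pairs, one per EagI site in `sequence`.

    `position` is the 0-based start of the site. `tag` is the site plus up
    to `flank` bases on each side, truncated at the sequence ends.
    The default flank length is an illustrative assumption.
    """
    seq = sequence.upper()
    tags = []
    # A zero-width lookahead reports overlapping sites as well,
    # e.g. the two sites in CGGCCGGCCG.
    for match in re.finditer(f"(?={EAGI_SITE})", seq):
        pos = match.start()
        start = max(0, pos - flank)
        end = min(len(seq), pos + len(EAGI_SITE) + flank)
        tags.append((pos, seq[start:end]))
    return tags
```

Each returned tag could then serve as a query sequence, with any EST hit inheriting the map position of the genomic sequence the tag came from.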