In vivo, the exon
regions of eukaryotic genes are transcribed as part of the heterogeneous nuclear RNAs (hnRNAs)
and are subsequently spliced together to form mRNAs.
In vitro, this process can be reversed by reverse transcribing
the complete set of mRNAs present
in a given tissue. This yields a library of complementary DNAs (cDNAs)
that correspond to the genes expressed in that sample. Sequencing
a short segment (<100 bases) at one or both ends of each cDNA produces a "tag"
with enough information to identify the expressed gene;
such tags are called expressed sequence tags (ESTs).
Comparing the ESTs from
a novel genome or tissue type with those from a well-characterized
genome helps to identify the function of the gene
loci expressed in the new
organism.
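
The two steps described above, taking a short tag from the ends of a cDNA and comparing it with sequences of known function, can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the method used in practice: the function names (end_tags, best_match), the default tag length of 80 bases, the k-mer size of 8, and the example sequences are all hypothetical, and counting shared k-mers stands in for a proper sequence-alignment search.

```python
def end_tags(cdna, tag_len=80):
    """Return the (5' end, 3' end) tags of a cDNA sequence.

    tag_len=80 is an illustrative choice consistent with "<100 bases".
    """
    cdna = cdna.upper()
    return cdna[:tag_len], cdna[-tag_len:]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(tag, reference, k=8):
    """Find the reference gene sharing the most k-mers with the tag.

    reference maps gene names to sequences from a well-characterized
    genome. Returns (gene_name, shared_kmer_count), or (None, 0) when
    no k-mer is shared with any reference sequence.
    """
    tag_kmers = kmers(tag.upper(), k)
    best_name, best_score = None, 0
    for name, seq in reference.items():
        score = len(tag_kmers & kmers(seq.upper(), k))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical reference sequences, made up for illustration only.
reference = {
    "geneA": "ATGGCGTACGTTAGCCTAGGATCCGATTACA",
    "geneB": "TTGACCGGTAAACCCGGGTTTAAACGCGTAT",
}

# A tag taken from a cDNA of the "new" organism.
name, score = best_match("GCGTACGTTAGCCTAGG", reference)
```

Here the tag is assigned to geneA because it shares k-mers only with that sequence. A real analysis would have to tolerate sequencing errors and divergence between species, which exact k-mer matching does not.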