F12-25smc2.jpg

Open & Closed Reading Frames in DNA

    Any piece of dsDNA can be “read” in six possible ways. There are two strands, either of which could be the sense strand read in the 5'→3' direction, and on either possible sense strand there are three possible start points, beginning at the 1st, 2nd, or 3rd nucleotide. Each of these six possibilities is called a reading frame.
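
    The six reading frames can be enumerated with a few lines of code. The following is a minimal sketch, not part of the original text: the function names and the frame numbering (1-3 on the given strand, 4-6 on its reverse complement) are assumptions made for illustration and need not match the numbering used in the figure.

```python
# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written in its own 5'->3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def six_reading_frames(seq: str) -> dict:
    """Split a DNA sequence into triplets for all six reading frames.

    Frames 1-3 start at the 1st, 2nd, and 3rd nucleotide of the given
    strand; frames 4-6 do the same on the reverse complement.
    This numbering is an assumption, chosen only for the example.
    """
    frames = {}
    strands = (seq.upper(), reverse_complement(seq))
    for strand_index, strand in enumerate(strands):
        for offset in range(3):
            # Take complete triplets only; trailing 1-2 bases are dropped.
            triplets = [strand[i:i + 3]
                        for i in range(offset, len(strand) - 2, 3)]
            frames[3 * strand_index + offset + 1] = triplets
    return frames


if __name__ == "__main__":
    for number, triplets in six_reading_frames("ATGAAATAG").items():
        print(number, " ".join(triplets))
```

    Each strand is read 5'→3', so the second strand is handled by taking the reverse complement first and then applying the same three offsets.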

    Because 3 of the 64 possible DNA triplets correspond to mRNA stop codons, a DNA sequence read at random will contain a 3-letter stop signal approximately once in every 20 triplets (64/3 ≈ 21). The occurrence of multiple stops in a particular reading frame indicates that it does not code for a polypeptide: this is a "closed" reading frame. In contrast, an Open Reading Frame (ORF) can be read through several hundred triplets without encountering a stop sequence. ORFs are therefore candidates for protein-coding exon regions: the inferred amino-acid sequence can be compared with sequences in GenBank to identify possible similar proteins.
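
    The stop-frequency estimate and the open/closed distinction can be illustrated as follows. This is a rough sketch under stated assumptions: all 64 triplets are taken to be equally likely (equal base frequencies), and the helper names are hypothetical, introduced only for this example.

```python
# DNA-sense equivalents of the mRNA stop codons UAA, UAG, and UGA.
STOP_TRIPLETS = {"TAA", "TAG", "TGA"}


def expected_triplets_per_stop() -> float:
    """Average spacing between stops if all 64 triplets are equally likely."""
    return 64 / len(STOP_TRIPLETS)


def longest_open_run(seq: str, offset: int = 0) -> int:
    """Length, in triplets, of the longest stop-free run in one frame.

    `offset` (0, 1, or 2) selects the start point on the given strand.
    """
    seq = seq.upper()
    best = current = 0
    for i in range(offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_TRIPLETS:
            current = 0          # a stop closes the current run
        else:
            current += 1
            best = max(best, current)
    return best


if __name__ == "__main__":
    print(round(expected_triplets_per_stop(), 1))
    for offset in range(3):
        print(offset, longest_open_run("ATGAAATAGCCC", offset))
```

    A frame whose longest stop-free run is only a few tens of triplets behaves like random sequence and is "closed"; a run of several hundred triplets is far longer than chance would predict and marks a candidate ORF.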

    In the example above, the longest open runs in reading frames #1, 2, 3, 4, and 6 [gold boxes] are very short, typically < 100 bases without a stop. Reading frame #5, read 5'→3' from right to left, includes an ORF [blue box] of more than 2,000 nucleotides, corresponding to a protein of more than 600 amino acids.


Figure © 2000 by Griffiths et al. ; text © 2012 by Steven M. Carr