The Genetic Code


The Central Dogma: DNA makes RNA makes protein

 

In principle: The DNA genotype does not produce the phenotype directly
   
A DNA gene contains the information necessary for the production of proteins,
        which is expressed biochemically through an intermediate molecule, RNA,
        which functions as a Genetic Code


The Genetic Code is an RNA code that specifies the amino acids that make up proteins
    It was "cracked" before the biochemistry of translation was understood:
         we can talk about the Code before describing RNA translation

Alternative alleles of genes arise by mutation
     which alters the DNA sequence of genes  
         which may cause amino acid substitutions in proteins
             which may affect the function of those proteins
     which appear in the population as SNPs (Single Nucleotide Polymorphisms)
         Most genes are highly polymorphic



The Genetic Code is ... 

        a messenger RNA (mRNA) code
            i.e., the code is written in RNA
            DNA carries genetic information,
                    but is not the 'genetic code' in the biochemical sense

        in 64 codons : 61 for amino acids + 3 'stops' [Table]
               mRNA codons are read 5' → 3'
               20 amino acids:  note 1- & 3-letter abbreviations
                                          [more on amino acids & proteins in next section]

       Degenerate: most amino acids are encoded by more than one codon
            first two positions are critical: third position can "wobble" [see next section]
                  if third can be either puRine (R = A or G), or either pYrimidine (Y = C or U)
                      two-fold degeneracy
                  if third can be any base 
                      four-fold degeneracy
                  Leucine (Leu) has six-fold degeneracy, with six codons in an unusual arrangement: CUN + UUR

Amino Acid                                     # codons
Trp, Met                                       1 each
Phe, Tyr, Cys, His, Gln, Asn, Lys, Asp, Glu    2 each
Ile                                            3
Pro, Thr, Val, Ala, Gly                        4 each
Ser, Arg, Leu                                  6 each
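
The degeneracy classes above can be checked by counting codons per amino acid. The following is a minimal sketch, not part of the original notes: it assumes the standard code written as a 64-letter string in U, C, A, G order (one-letter amino acid abbreviations, "*" for stop), and the names CODON_TABLE and degeneracy_classes are illustrative only.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "UCAG"
# Standard code, one letter per codon, codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def degeneracy_classes(table=CODON_TABLE):
    """Group amino acids by the number of codons that encode them (stops excluded)."""
    counts = Counter(aa for aa in table.values() if aa != "*")
    classes = defaultdict(list)
    for aa, n in counts.items():
        classes[n].append(aa)
    return {n: sorted(aas) for n, aas in sorted(classes.items())}


# Print one line per degeneracy class: codon count, then the amino acids in it
for n_codons, amino_acids in degeneracy_classes().items():
    print(n_codons, " ".join(amino_acids))
```

The class sizes multiplied by their codon counts should sum to the 61 sense codons, which is a quick consistency check on any such table.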

     Unambiguous: any one triplet codes for only one amino acid
                but not vice versa, because of wobble

        'Always' begins with a 'start' or 'initiator' codon:  AUG

        'Always' ends with a 'stop' or 'terminator' codon:  UAG, UAA, or UGA

     Universal (with some important exceptions)
            Five Kingdoms (animals, plants, algae, fungi, & monera)
                        use same code for nuclear DNA (nucDNA)

                Organelles (chloroplasts & mitochondria) have separate genomes:
                cpDNA & mitochondrial DNA codes evolutionarily modified
                   e.g., UGA codes for trp (W) in vertebrate mtDNA code 
                              AUN may act as Start Codons in invertebrate mtDNA code
                             Stop codons may be formed by addition of "A"s to transcript
                             Lab exercises use mtDNA data, so the mtDNA code is important
                 27+ alternative codes in various evolutionary lineages
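
The start / stop rules and the idea of an alternative code can be sketched as a small translation routine. This is a simplified illustration under stated assumptions, not a description of translation biochemistry: reading starts at the first AUG found, and the variant table MT_SKETCH changes only the one reassignment named above (UGA = Trp); real mtDNA codes differ at other codons as well. All names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Assumption: only the UGA -> Trp (W) reassignment from the notes is modelled here
MT_SKETCH = dict(STANDARD_CODE, UGA="W")


def translate(mrna, code=STANDARD_CODE):
    """Read 5' -> 3' from the first AUG, stopping at the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiator codon found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = code[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


message = "GCAUGUGAAUAUAA"  # made-up transcript: AUG UGA AUA UAA after a 2-base leader
print(translate(message))             # UGA read as stop
print(translate(message, MT_SKETCH))  # UGA read as Trp
```

The same transcript gives a shorter product under the standard code than under the modified one, which is why the correct table matters when working with mtDNA data.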



Alteration & Variation in the Genetic Code: Mutations & SNPs

    Mutations - interchanges of one base type for another
        transitions   - alternative pyrimidines [ C ↔ T ]  or purines [ A ↔ G ]
        transversions - purines ↔ pyrimidines [ C / T ↔ A / G ]

     Recognized in individuals & populations as SNPs ("snips": single nucleotide polymorphisms)
                [SNPs, Mutations, & Mutants: a note on terminology & some lessons from history]

        Alternative nucleotide sequences of a gene correspond to alternative alleles
             or: a single gene occurs in variant forms (alleles)
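
The transition / transversion distinction is easy to state in code. A minimal sketch, assuming two aligned allele sequences of equal length with no indels; the function names and the two example alleles are invented for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def classify_substitution(ref, alt):
    """Return 'transition' (same base class) or 'transversion' (purine <-> pyrimidine)."""
    ref, alt = ref.upper(), alt.upper()
    for base in (ref, alt):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a nucleotide: {base!r}")
    if ref == alt:
        raise ValueError("no substitution: bases are identical")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def snp_sites(allele_1, allele_2):
    """List (position, base_1, base_2, kind) wherever two aligned alleles differ."""
    if len(allele_1) != len(allele_2):
        raise ValueError("alleles must be aligned and of equal length")
    return [
        (i, a, b, classify_substitution(a, b))
        for i, (a, b) in enumerate(zip(allele_1, allele_2))
        if a != b
    ]


# Two made-up alleles differing at positions 4 and 8 (0-based)
print(snp_sites("ATGGAGCTT", "ATGGTGCTC"))
```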

  Single-base mutations
        Consequences of exon SNPs depend on position in triplet

            3rd position
                 typically a silent mutation - if position "wobbles", no change to amino acid
                 sometimes a mis-sense mutation - results in a different amino acid

           2nd position - always a mis-sense mutation
           1st position - almost always a mis-sense replacement
                                      [Leu codons are major exception]
            stop codon mutations may occur at any position: coding → non-coding triplet
                non-sense (termination) mutations terminate polypeptides prematurely
                HOMEWORK #8: Identify all codons one step away from a termination codon

        mutations in non-coding DNA have variable effects
               Ex.: mutations in promoter regions
                       mutations at intron / exon splice junctions
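
The dependence of exon SNP consequences on codon position can be tallied directly from the code table. The sketch below is an illustration under stated assumptions: it uses the standard code, considers every possible single-base change in each of the 61 sense codons, and weights all changes equally (real mutation spectra are not uniform). Function names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def effect(codon, position, new_base):
    """Classify one base change in a sense codon as silent, missense, or nonsense."""
    mutant = codon[:position] + new_base + codon[position + 1:]
    before, after = CODE[codon], CODE[mutant]
    if after == before:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"


def tally_by_position():
    """Count outcomes of all single-base changes, separately for positions 1, 2, 3."""
    tallies = [Counter(), Counter(), Counter()]
    for codon, amino_acid in CODE.items():
        if amino_acid == "*":
            continue  # start from sense codons only
        for position in range(3):
            for base in BASES:
                if base != codon[position]:
                    tallies[position][effect(codon, position, base)] += 1
    return tallies


for number, tally in enumerate(tally_by_position(), start=1):
    print(f"position {number}:", dict(tally))
```

Under these assumptions, silent changes are concentrated at the third position, none occur at the second, and the few at the first position come from codon families that span two first-base groups.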

Mis-sense mutations in DNA cause amino acid substitutions in protein
   Proteins do not mutate!
      Consequences depend on position of substitution in polypeptide
        none:  substitution not in active site or binding site
        minor: substitution of same type (synonymous substitution)
             Allozymes are enzymes arising from allelic variation of enzyme genes
        major: substitution affects structure / function (nonsynonymous substitution)
             Ex.: Glu → Val   in beta-globin  produces Sickle-cell hemoglobin (HbS)
                         HOMEWORK #9: What is the DNA mutation involved?

Insertion / Deletion ("indel") mutations
        gain or loss of one or two nucleotides alters the reading frame
        frameshift mutations  (examples)
              single & double nucleotide indel → downstream amino acids change
                    non-sense mutation eventually (quickly) produced
              triplet indel - insertion / deletion of single amino acid
                   typically milder consequences
                   multiple triplet insertions produce major effects
                       Ex.: CGG repeats in "Fragile X" Syndrome
         length mutations - very large indels (10^2 ~ 10^6 bp)
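
The contrast between frameshift and triplet indels can be seen by translating a short message before and after deletion. A minimal sketch, assuming the standard code and a made-up 21-base mRNA that begins at its start codon; the helper names are illustrative only.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate from the first base, three at a time, until a stop codon or the end."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def delete(mrna, start, length):
    """Remove `length` bases beginning at 0-based index `start`."""
    return mrna[:start] + mrna[start + length:]


reference = "AUGGCUGAAUUCCGUAAAUGA"  # AUG GCU GAA UUC CGU AAA UGA

print(translate(reference))                # MAEFRK
print(translate(delete(reference, 3, 1)))  # MLNSVN: every downstream amino acid changes
print(translate(delete(reference, 3, 2)))  # M: new frame hits a stop codon at once
print(translate(delete(reference, 3, 3)))  # MEFRK: one amino acid lost, frame kept
```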


Genes are highly polymorphic (w/ multiple alleles) wrt their SNP variation
       
[Concept of "wild type" allele is erroneous]

        Phenylalanine Hydroxylase (PAH) (OMIM citation 261600)
             has 14 exons, encodes 2.4kb mRNA for 452 amino acid protein

        Among 68 alleles that affect enzymatic activity of PAH  [GenBank List]
                68% mis-sense SNPs (many produce Phenylketonuria (PKU))
                13% non-sense SNPs (premature termination)
                  9% indel SNPs (single base, 1~5 triplets, whole exon)
                10% splice-site SNPs (including most common variant allele)
              
        Most allelic variants of the PAH locus are 3rd position silent:
                no effect on PAH expression
                & therefore undetected



Homework #10:
     (1) "What is a Gene?" Write a one-paragraph essay that distinguishes Gene, Allele, and Locus
     (2) Critique the following statements:
            "PAH is the gene for Phenylketonuria (PKU)."
            "PKU is a genetic disease caused by absence of the PAH gene."


Text material ©2024 by Steven M. Carr