Biochemistry 3107 - Fall 2003

Eukaryotic Transcription

 

Eukaryotic transcription

Eukaryotic transcription is more complex than prokaryotic transcription and, until recently, it seemed that every eukaryotic gene was unique, requiring its own transcription machinery.

However, it is now possible to simplify the story somewhat. The promoters for different genes are different. Each contains a combination of sites to which specific protein factors bind. All of these factors help RNA polymerase to bind in the correct place and to initiate transcription. However, the repertoire of transcription factors and transcription factor binding sites is not unlimited.

There are three distinct RNA polymerases in a eukaryotic cell nucleus which define the three major classes of eukaryotic transcription unit:

 

polymerase  location           type of RNA transcribed              sensitivity to α-amanitin§
I           nucleus/nucleolus  rRNA (except for 5S rRNA)            resistant
II          nucleus            hnRNA (i.e. pre-mRNA)                very sensitive
III         nucleus            small RNA such as tRNA and 5S rRNA   moderately sensitive

 

§α-amanitin is the toxin found in the poisonous death cap mushroom, Amanita phalloides.
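The table can be read as a simple lookup. The following is a minimal Python sketch of that idea; the dictionary name, the field names and the helper function are illustrative choices, not part of any library, and the values are copied directly from the table.

```python
# Illustrative encoding of the polymerase table above.
# POLYMERASES, its field names and the helper are hypothetical names.
POLYMERASES = {
    "I": {
        "location": "nucleus/nucleolus",
        "rna": "rRNA (except for 5S rRNA)",
        "amanitin": "resistant",
    },
    "II": {
        "location": "nucleus",
        "rna": "hnRNA (i.e. pre-mRNA)",
        "amanitin": "very sensitive",
    },
    "III": {
        "location": "nucleus",
        "rna": "small RNA such as tRNA and 5S rRNA",
        "amanitin": "moderately sensitive",
    },
}


def polymerases_with_sensitivity(level):
    """Return the polymerase classes whose alpha-amanitin sensitivity equals `level`."""
    return sorted(
        name for name, row in POLYMERASES.items() if row["amanitin"] == level
    )
```

For example, `polymerases_with_sensitivity("very sensitive")` picks out RNA polymerase II, which is why α-amanitin is often used to distinguish class II transcription from the other two classes.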

 

 

Each type of RNA polymerase is a complex of many polypeptide subunits. Voet and Voet state it best:

 "Eukaryotic RNA polymerases are characterized by subunit compositions of Byzantine complexity".

There may be as many as 14 subunits in a eukaryotic RNA polymerase; the total molecular weight is typically 500-700 kD.

Eukaryotic RNA polymerases cannot find or bind to a promoter by themselves. Each requires the binding of assembly factors and a positional factor to locate the promoter and to orient the polymerase correctly. As we will see, the positional factor is the same in all cases.

 

Class I Transcriptional Units

Class I genes or transcriptional units are transcribed by RNA polymerase I in the nucleolus. The best-studied examples are the rRNA transcription units:

[MVH28-20]

 

 This picture is the Entrez graphical representation of the Tetrahymena thermophila rDNA containing the 17S, 5.8S and 26S rRNA genes. Note the intron in the 26S gene.

 

Each transcription unit consists of 3 rRNA genes: 18S, 5.8S, and 28S; and each unit is separated by a nontranscribed spacer. Eukaryotic nucleoli typically have many hundreds of copies of these transcription units tandemly arranged.

 

The enzyme

RNA polymerase I is a complex of 13 subunits. It is insensitive to α-amanitin.

[25-11a]

 

The promoter

The CORE promoter region is located from -31 to +6 around the transcription startpoint. Another sequence further upstream, called the upstream control element (UCE) and located from -187 to -107, is also required for efficient transcription.

Both elements are closely related; there is approximately 85% sequence identity between them. These elements are also unusual in that they are GC-rich. In general, sequences around the start-point of transcription tend to be AT-rich so that melting of the DNA duplex is easier.
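To make the coordinates and the terms "sequence identity" and "GC-rich" concrete, here is a small Python sketch. It assumes the usual convention that there is no position 0 (so -1 is followed directly by +1), and it assumes that the two sequences being compared have already been aligned and are of equal length. The function names are illustrative, and the short sequences in the usage note are invented toy data, not the real rDNA elements.

```python
def span_length(start, end):
    """Length in bp of a region given in promoter coordinates.

    Assumes there is no position 0: -1 is immediately followed by +1.
    """
    if start == 0 or end == 0 or start > end:
        raise ValueError("coordinates must be nonzero with start <= end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the region spans the startpoint; skip the missing position 0
    return length


def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def percent_identity(a, b):
    """Percent of matching positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)
```

With these definitions, `span_length(-31, 6)` gives 37 bp for the CORE promoter and `span_length(-187, -107)` gives 81 bp for the UCE. For the toy pair `"GGCGCCGATC"` and `"GGCGCCGTTC"`, `percent_identity` returns 90.0 and `gc_fraction` of the first sequence is 0.8.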

 

Assembly of a transcriptional complex

Two additional transcription factors are known to be required to assist RNA polymerase I.

UBF1

is a single polypeptide which binds to the upstream control element (UCE) and to the CORE promoter. UBF1 recognizes a GC-rich sequence within these elements. UBF1 is an assembly factor.

 

SL1

binds to UBF1. It consists of 4 proteins, one of which is TATA-box binding protein (TBP).

TBP is required for the assembly of a transcriptional complex in all 3 classes of eukaryotic transcription unit. SL1 is a positional factor - it targets RNA polymerase at the promoter so that it initiates transcription in the correct place.

 

Once UBF1 and SL1 have formed a complex, RNAP I binds to the CORE promoter to initiate transcription:

 

 

[More detailed diagram of Class I assembly]

 

 

Class II Transcription Units

All genes that are transcribed and expressed via mRNA are transcribed by RNA polymerase II.

Until recently, it was common to think of eukaryotic transcription (and particularly mRNA synthesis) as taking place in discrete steps: transcription, capping, tailing, splicing and export from the nucleus for translation. The contemporary view of eukaryotic gene expression entails simultaneous transcription and processing. Recent discoveries have revealed that many of the protein factors required for these individual steps do, in fact, interact with one another. This makes sense, for it allows the cell to coordinate and regulate the complete process more efficiently.

The two images below are from a recent review article. Study them carefully to see how the contemporary view of gene expression (right) contrasts with the more traditional view (left).

 

IMAGES FROM: G. Orphanides and D. Reinberg (2002) A Unified Theory of Gene Expression. Cell 108: 439-451.

 

The enzyme

RNA polymerase II is a complex multisubunit enzyme - the yeast enzyme has 12 subunits. The largest subunit contains the catalytic activity. The enzyme is very sensitive to α-amanitin (K = 10⁻⁸ M).

[25-11b]

RNA polymerase II can transcribe RNA from nicked dsDNA templates or from ssDNA templates. However, by itself, it cannot initiate transcription at a promoter. In this respect, it resembles the core form of bacterial RNA polymerase.

 

The promoter

Promoters used by RNA polymerase II have different structures depending upon the particular combination of transcription factors that are required to build a functional transcriptional complex at each promoter. Nevertheless, these different structures can be viewed as a combination of a relatively limited number of specific sequence elements.

Some of the common elements that have been described in class II eukaryotic promoters are the following:

The following diagram shows some examples of eukaryotic promoters and the combination of sequence elements that they contain:

 Diagram based on and adapted from Figure 28.26 of Mathews & van Holde, Biochemistry, 2nd ed.

 

 

In addition to the above elements, enhancers may be required for full expression. These elements are not part of the promoter per se. They can be located upstream or downstream of the promoter and may be quite far away from it. The mechanism by which they work is not known. They may provide an entry point for RNA polymerase or they may bind other proteins that help RNA polymerase to bind to the promoter region. [figure]

 

The transcriptional complex

When it was first purified and characterized, it was found that RNA polymerase II can transcribe mRNA in vitro as long as a suitable template -- such as a nicked dsDNA or ssDNA -- is provided. The fact that the enzyme could not initiate transcription correctly on a dsDNA template indicated that RNA polymerase II could not function alone in the cell nucleus, and a search was begun for additional transcription factors.

At least six general (or basal) transcription factors (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH) have been characterized. In the presence of these transcription factors, the enzyme is able to initiate transcription at promoters correctly. However, even in the presence of transcription factors, the enzyme complex is unable to recognize and respond to regulatory signals.

In addition to the general transcription factors, the transcriptional complex will also be affected by the presence of any promoter-proximal regulatory sequences and the presence of transcription factors that bind to those sequences. Such factors may be present in some cells/tissues but not in others. For example, the octamer motif (shown for the histone H2B gene above) binds two different transcription factors: Oct-1 and Oct-2. Oct-1 is ubiquitous but Oct-2 is expressed only in lymphoid cells, where it activates immunoglobulin κ light chain gene transcription.

A search for missing factors which would allow the transcriptional complex to respond to regulatory signals has revealed the role of a number of new protein complexes. The first complex to be identified is called the Srb-Mediator. Others that have since been identified are Srb10-CDK and Swi-Snf. These protein complexes confer a regulatory response capability on RNA polymerase II.

It has been suggested that the combination of RNA polymerase II, transcription factors, and regulatory response complexes such as the Srb-Mediator, is the eukaryotic equivalent of a holoenzyme. This view is still subject to some debate.

 

 Diagram based on and adapted from Fig. 1 of Struhl, K., Cell 84: 179-182

 

Assembly of a transcriptional complex

Many studies have been carried out to determine the way in which a transcriptional complex is assembled at a class II promoter and in particular in which order the various protein factors assemble.

The following image shows a model for the assembly of a Class II eukaryotic transcription complex:


[MVH28-27]

The basic process is likely to include the following steps:

  1. TFIID recognizes and binds to the TATA box.

    TFIID consists of the TATA-box binding protein (TBP) and ~10 TBP-associated factors (TAFs).

    TBP is a 180 amino acid protein that consists of two very similar 66 amino acid domains separated by a short basic region. The protein has a "saddle-shaped" structure that sits astride a DNA molecule and binds to it via contacts in the minor groove. Binding also causes an 80° bend in the DNA.

    [25-13a] [25-13a]

    TFIID is a positional factor - it targets RNA polymerase to the promoter. In the case of class II transcriptional units, however, TBP binds directly to DNA.

     The Structure of TATA binding Protein (TBP) with DNA from the Hahn laboratory at the Fred Hutchinson Cancer Research Center in Seattle.


  2. TFIIA binds and stabilizes TFIID binding.

     Structure of the TFIIA-TBP-DNA Complex from the Hahn laboratory at the Fred Hutchinson Cancer Research Center in Seattle.


  3. The RNA polymerase II holoenzyme assembles - possibly in a stepwise manner - to form a preinitiation complex.

    [25-14] [MVH28-24]

    The holoenzyme consists of the RNA polymerase II complex, the regulatory complexes and the following transcription factors:
    • TFIIB

      TFIIB is a single polypeptide. It can bind both upstream and downstream of the TATA box (i.e. closer to the startpoint of transcription). It recruits TFIIF-RNAPII to the complex. It may interact directly with RNAP II.



    • TFIIE

      TFIIE is a complex of two subunits. It recruits TFIIH to the complex thereby priming the initiation complex for promoter clearance and elongation.

    • TFIIF

      TFIIF also has two subunits - RAP30 & RAP74. The latter has a helicase activity and may therefore be involved in melting the DNA at the promoter to expose the template strand.

      More about TFIIF from Zachary Burton's web pages at Michigan State University.


    • TFIIH

      TFIIH is a complex of 9 subunits. One of the subunits has a kinase activity that carries out the phosphorylation that is required for promoter clearance (see below).

      The two largest subunits (XPB and XPD) have helicase activity; this activity of TFIIH is also required for nucleotide excision repair in the cell, and mutations in these subunits are associated with three genetic disorders: xeroderma pigmentosum and Cockayne syndrome (repair defects) and trichothiodystrophy (a transcription defect). Another subunit is a cyclin (cdk7 - cyclin H).

 

There is some evidence that the order of assembly of transcription factors may be TFIID -> TFIIA -> TFIIB -> (TFIIF + RNAP II) -> TFIIE -> TFIIH.

[25-15]

Finally, the various regulatory factors (Srb-Mediator, Srb10-CDK and Swi-Snf) bind to complete formation of the pre-initiation complex.
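The proposed order of assembly can be written down as an ordered list, which makes it easy to ask "what binds next?" for any partially assembled complex. The Python sketch below is only a bookkeeping aid: it treats assembly as strictly linear, which is a simplification of a model that the text itself describes as tentative. The names `ASSEMBLY_ORDER` and `next_step` are illustrative.

```python
# Proposed assembly order taken from the text above; components listed in the
# same tuple are assumed to join together. Regulatory complexes join last.
ASSEMBLY_ORDER = [
    ("TFIID",),
    ("TFIIA",),
    ("TFIIB",),
    ("TFIIF", "RNAP II"),
    ("TFIIE",),
    ("TFIIH",),
    ("Srb-Mediator", "Srb10-CDK", "Swi-Snf"),
]


def next_step(bound):
    """Return the components expected to bind next, or None if assembly is complete.

    `bound` is any iterable of component names already present at the promoter.
    """
    bound = set(bound)
    for step in ASSEMBLY_ORDER:
        if not set(step) <= bound:
            return step
    return None
```

For example, `next_step(["TFIID", "TFIIA", "TFIIB"])` returns `("TFIIF", "RNAP II")`, reflecting the idea that TFIIB recruits the TFIIF-RNAP II complex.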

The adjacent diagram from Benoit Coulombe's laboratory at the Clinical Research Institute of Montreal illustrates a proposed structure for the pre-initiation complex containing TBP, TFIIB, TFIIF (F74 and F30), TFIIE (E56 and E34), TFIIH and RNAPII. Note that there are two copies of TFIIE and TFIIF in the complex.

The above image is found at http://www.ircm.qc.ca/benoitcoulombe/A-00fig7a.html

You can see more models of transcription complexes by visiting Benoit Coulombe's models page. You can also view a Quicktime VR animation of this complex and a model for the mechanism of promoter melting by TFIIH.

 

 

  4. The carboxy terminal domain (CTD) of the largest subunit of RNA polymerase II is phosphorylated.

    This results in promoter clearance. RNA polymerase II dissociates from the transcription factors and other protein complexes that were required for assembly. The following images show that phosphorylation of the CTD occurs in stages and helps to recruit the capping and splicing machinery to the elongation complex.

    IMAGES FROM: G. Orphanides and D. Reinberg (2002) A Unified Theory of Gene Expression. Cell 108: 439-451.


The CTD consists of 52 repeats of the amino acid sequence Y-S-P-T-S-P-S.

The kinase activity of TFIIH phosphorylates Ser5. This serves to recruit the transcription elongation factor DSIF to the complex, which in turn recruits another elongation factor, NELF, which arrests transcription. This pause permits the capping enzymes to join and modify the 5' end of the transcript. A third elongation factor, P-TEFb (a kinase), then joins and phosphorylates both the CTD (at Ser2) and NELF, neutralising the arrest.
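The repeat structure of the CTD, and what "Ser2" and "Ser5" refer to, can be illustrated with a few lines of Python. This sketch builds an idealised CTD from 52 perfect copies of the heptad given in the text and lists the positions of the serine targeted by each kinase; real CTDs are not guaranteed to consist only of perfect consensus repeats. All names in the code are illustrative.

```python
HEPTAD = "YSPTSPS"   # Y1-S2-P3-T4-S5-P6-S7, as given in the text
N_REPEATS = 52       # repeat number given in the text

# Which heptad position each kinase phosphorylates, according to the text.
KINASE_TARGET = {"TFIIH": 5, "P-TEFb": 2}


def ctd_sequence(n_repeats=N_REPEATS):
    """Return an idealised CTD made of `n_repeats` perfect heptads."""
    return HEPTAD * n_repeats


def target_positions(kinase, n_repeats=N_REPEATS):
    """1-based positions within the idealised CTD phosphorylated by `kinase`."""
    offset = KINASE_TARGET[kinase]
    if HEPTAD[offset - 1] != "S":
        raise ValueError("target position is not a serine")
    return [i * len(HEPTAD) + offset for i in range(n_repeats)]
```

Under these assumptions the idealised CTD is 52 × 7 = 364 residues long, and each kinase has 52 potential target serines, one per repeat.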

 


There is also evidence that an additional transcription factor, TFIIS, participates in transcription elongation.

 

 

Transcription factors

The latest release of TRANSFAC, a transcription factor database, in 2001 contained 2785 entries. Many of these are homologous proteins from different species; nevertheless, this number is indicative of the vast number of transcription factors now known to regulate the expression of eukaryotic genes. Any detailed treatment of these factors is well beyond the scope of this course.

Transcription factors are the ultimate targets of cell-signalling pathways. Whenever cells need to respond to an extracellular signal such as a hormone, the response is mediated by a change in gene expression that comes about, most often, as the result of a change in the phosphorylation state of a transcription factor.

For example, growth hormone binding to its receptor catalyses the autophosphorylation of a tyrosine residue in the cytoplasmic domain of the receptor. This, in turn, is recognised by the SH2 (Src homology 2) domain of a cytoplasmic response protein which, through its further interactions, activates the Ras protein. Ras is a G protein that is active when GTP is bound but inactive when GDP is bound. Ras then activates a series of kinases until, finally, one of these migrates to the nucleus where it phosphorylates a transcription factor such as Fos, Jun or Myc.

[MVH23-23] [21-15] [21-16] [MVH23-24]

The importance of transcription factors in the regulation of gene expression can be illustrated by looking at the regulation of lipid metabolism. Over 30 different transcription factors are involved in the regulation of lipid metabolism genes. The PPAR and SREBP families of transcription factors are particularly important. Fatty acid oxidation is regulated by PPAR factors; cholesterol homeostasis is regulated by SREBP factors.

The following diagrams illustrate networks of genes regulated by PPAR factors (which are Zn finger proteins) [LEFT] and a network of cholesterol metabolism genes controlled by SREBP [RIGHT]:

Here and in Fig. 2, circles indicate proteins; rectangles show genes coding for these proteins. ACO, acyl-CoA oxidase; ACS, acyl-coenzyme A synthetase; apoAI - apolipoprotein AI; apoCIII - apolipoprotein CIII; GR - glucocorticoid receptor; HD, Hydratase-dehydrogenase; HNF4, hepatocyte nuclear factor 4; PPAR - peroxisome proliferator activated receptor.

ACC, acetyl coenzyme A carboxylase; f.a., fatty acids; FAS, fatty acid synthase; FDPS, farnesyl diphosphate synthase; HMG-CoA-R, 3-hydroxy-3-methylglutaryl CoA reductase; HMG-CoA-S, 3-hydroxy-3-methylglutaryl CoA synthase; LDL, low density lipoprotein; LDLR, low density lipoprotein receptor; preSREBP, sterol regulatory element-1 binding protein precursor; SREBP, sterol regulatory element-1 binding protein; SRP, sterol-regulated protease; SS, squalene synthase; VLDL, very low density lipoprotein.

Above diagrams taken from THE LIPID METABOLISM TRANSCRIPTION REGULATORY REGIONS DATABASE (LM-TRRD) of Dr. E.V. IGNATIEVA at the Laboratory of Theoretical Genetics, Institute of Cytology and Genetics (Siberian Branch of the Russian Academy of Sciences), Novosibirsk, Russia. [The above links may be slow]

 

 

Class III Transcription Units

Class III genes are principally those for small RNA molecules in the cell. The best studied examples are the 5S rRNA gene, which has been studied extensively in Xenopus laevis, and tRNA genes.

The enzyme

RNA polymerase III is the largest of the three RNA polymerases, with 17 subunits and a molecular weight of over 700 kD. It is moderately sensitive to α-amanitin. It is also the most active.

 

The promoter

Class III promoters are distinctive because some of them are located within the gene whose transcription they direct.

The promoters for 5S rRNA and tRNA genes are located within the gene. In the case of the Xenopus laevis 5S rRNA gene, which is 120 bp in length, it has been found that the segment from +41 to +87 is sufficient to direct transcription and therefore defines the promoter.

The promoters for snRNA genes lie upstream of the startpoint of transcription.
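The distinction between internal and upstream promoters is just a matter of where the promoter coordinates fall relative to the transcribed region. The Python sketch below classifies a promoter from its coordinates, assuming that positions are numbered from the startpoint (+1) and that there is no position 0. The function name is illustrative, and the snRNA coordinates mentioned in the usage note are invented purely for illustration; only the 5S rRNA values come from the text.

```python
def promoter_position(start, end, gene_length):
    """Classify a promoter relative to the gene it controls.

    Coordinates are relative to the transcription startpoint (+1);
    position 0 does not exist. `gene_length` is the length of the
    transcribed region in bp.
    """
    if start == 0 or end == 0 or start > end:
        raise ValueError("coordinates must be nonzero with start <= end")
    if start >= 1 and end <= gene_length:
        return "internal"      # lies entirely within the transcribed region
    if end <= -1:
        return "upstream"      # lies entirely before the startpoint
    return "overlapping"       # anything else, e.g. spans the startpoint
```

For the Xenopus laevis 5S rRNA gene, `promoter_position(41, 87, 120)` returns `"internal"`. A hypothetical promoter at -70 to -20 would be classified as `"upstream"`, as for snRNA genes.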

 

Assembly of a transcriptional complex

Assembly of a functional complex requires the participation of a number of additional transcription factors. Transcription of the 5S rRNA gene requires three additional factors; transcription of tRNA genes requires two.

[The PolIII transcription machinery in S. cerevisiae - from Dr. Ian Willis's homepage at Albert Einstein College of Medicine]

The following transcription factors have been characterized:

TFIIIA

This factor is required only for the transcription of 5S rRNA genes. It contains a single polypeptide with a Zn finger DNA-binding motif. It functions as an assembly factor for some class III promoters but not for all.

[MVH28-22]

 

TFIIIB

This factor contains three subunits, one of which is TBP - TATA-box binding protein. TFIIIB is a positional factor.

Look at Protein-DNA and Protein-Protein interactions in the TFIIIB-DNA Complex from the Hahn laboratory at the Fred Hutchinson Cancer Research Center in Seattle.

 

TFIIIC

TFIIIC consists of 6 subunits. It also functions as an assembly factor and appears to be required for all internal class III promoters.
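The factor requirements described above can be summarised as a small lookup that reproduces the "three factors for 5S rRNA, two for tRNA" rule. This Python sketch is an illustration only: the data structure and function name are invented, and it assumes that TFIIIB, as the TBP-containing positional factor, is needed for both gene types.

```python
# Roles and gene requirements as described in the text.
# Assumption: TFIIIB (the positional factor) is required for both gene types.
CLASS_III_FACTORS = {
    "TFIIIA": {"role": "assembly", "genes": {"5S rRNA"}},
    "TFIIIB": {"role": "positional", "genes": {"5S rRNA", "tRNA"}},
    "TFIIIC": {"role": "assembly", "genes": {"5S rRNA", "tRNA"}},
}


def factors_for(gene):
    """Return the transcription factors needed to transcribe `gene`."""
    return sorted(
        name for name, info in CLASS_III_FACTORS.items() if gene in info["genes"]
    )
```

With this table, `factors_for("5S rRNA")` lists TFIIIA, TFIIIB and TFIIIC, while `factors_for("tRNA")` lists only TFIIIB and TFIIIC.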

 

Assembly of a transcription complex proceeds in a step-wise manner:

[Genetic Expression in Eucaryotes: DNA Transcription from Dr. Andre Sentenac's web page at CEA, Gif-sur-Yvette, France]

 

 

The following diagram illustrates these steps for the 5S rRNA promoter:

[MVH_28-21]

RNA processing

The initial RNA molecules that are transcribed in the nucleus must often be processed after being synthesized and before leaving the nucleus en route to the ribosomes located in the cytoplasm. Three types of modification are made:

[25-18]

 

 

[figure from melissa moore's home page] [Box-25-3]

 


RESOURCE MATERIAL
VOET, VOET & PRATT
  1. Chapter 21, Mammalian Fuel Metabolism: Integration and Regulation, pages 677 - 683
  2. Chapter 25, Transcription and RNA Processing, pages 822 - 830
  3. Chapter 27, Regulation of Gene Expression, pages 912 - 914
STRYER
  1. Chapter 33, RNA Synthesis and Splicing, pages 853-858
  2. Chapter 37, Eukaryotic Chromosomes and Gene Expression, pages 998-1002
LEHNINGER
  1. Chapter 24, RNA Metabolism, pages 863 - 865
  2. Chapter 27, Regulation of Gene Expression, pages 974 - 977
TAMARIN
  1. Chapter 10, pages 250 - 254
  2. Chapter 15, pages 421 - 422

Format and Original Material © Martin E. Mulligan, 1996-2003