Principles of Cell Biology (BIOL2060)

Department of Biology
Memorial University of Newfoundland

Regulation of Gene Expression

Prokaryotic Gene Regulation: Regulation of the lac operon (dual control: repression and promotion)

Example of prokaryotic gene control: the lac operon.
A classic example of genetic control is the well-studied system of inducible lactose (milk sugar) catabolism in the human gut symbiont, Escherichia coli.
The lac operon includes 3 structural genes (lacZ, lacY and lacA) that are transcribed in unison.
Located near the lac operon, the lacI gene regulates the operon by producing the lac repressor protein.
Both the regulatory gene and the lac operon itself contain...
1) promoters (Pl and Plac) at which RNA polymerase binds and
2) terminators at which transcription halts.
Plac overlaps with the operator site (O) to which the active form of the repressor protein binds.
The operon is transcribed into a single long molecule of mRNA that codes for all three polypeptides.

Transcription of the lac operon is down-regulated through the binding of the lac repressor to the operator.
In the absence of lactose, the repressor remains bound to the operator and prevents access of the RNA polymerase to the promoter.
Transcription is blocked and the operon is repressed.
In the presence of lactose, the repressor is in its inactive form and does not bind to the operator.
Thus the RNA polymerase may bind to the promoter and transcribe the structural genes into a single polycistronic mRNA.
The isomeric form of lactose that binds to the repressor is allolactose.
The lac repressor is an allosteric protein capable of reversible conversion between two alternative forms.
In the absence of the effector allolactose, the protein is in the form that binds to the lac operator.
In the effector’s presence, the repressor mostly exists in the alternative and inactive state.

Transcription of the lac operon is up-regulated through the binding of the cAMP Receptor Protein (CRP) complex to the promoter.
The cAMP Receptor Protein (CRP) is an allosteric protein that is inactive in the free form but is activated by binding to cAMP.
The CRP-cAMP complex binds the promoter of inducible operons, increasing the affinity of the promoter for RNA polymerase to stimulate transcription.
The effect of active CRP on the lac operon.
1) The CRP-cAMP complex binds to the CRP recognition site near the promoter region, thereby
2) making the promoter more readily bound by RNA polymerase.
3) RNA polymerase binds to the promoter and transcribes the operon.
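
The dual control described above can be summarized as a small truth table. The Python sketch below is only an illustration: the function name lac_transcription and the qualitative labels are hypothetical, and the "basal" level returned when the repressor is released but CRP-cAMP is absent is an assumption that is consistent with, but not stated in, these notes.

```python
def lac_transcription(allolactose_present: bool, camp_present: bool) -> str:
    """Qualitative transcription level of the lac operon (illustrative only)."""
    # Without allolactose the repressor stays in its operator-binding form.
    repressor_on_operator = not allolactose_present
    # CRP is active only when it is bound to cAMP.
    crp_active = camp_present

    if repressor_on_operator:
        return "repressed"
    if crp_active:
        return "high"
    return "basal"  # assumption: weak transcription without CRP-cAMP


for allo in (False, True):
    for camp in (False, True):
        level = lac_transcription(allo, camp)
        print(f"allolactose={allo!s:5} cAMP={camp!s:5} -> {level}")
```

Only the combination of allolactose (repressor released) and cAMP (CRP active) gives the "high" label, which is the sense in which the operon is under dual control.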

Prokaryotic Gene Regulation: Regulation of the trp operon (a "riboswitch")

An example of genetic control in prokaryotes: the trp operon
The trp operon includes five structural genes (trpE, trpD, trpC, trpB, and trpA) as well as promoter (Ptrp), operator (O), and leader (L) sequences.
The structural genes are transcribed and regulated as a unit.
The repressor protein, encoded by the trpR gene, is inactive (cannot recognize the operator site) in the free form when tryptophan is not abundant.
The polycistronic mRNA encodes the enzymes of the tryptophan biosynthetic pathway.
When complexed with tryptophan, the repressor is active and binds tightly to the operator, blocking access of RNA polymerase to the promoter and keeping the operon repressed.

In prokaryotes, no nuclear membrane separates transcription and translation and the ribosomes will bind the nascent message soon after it emerges from the RNA polymerase.
The close linkage of the processes can lead to interdependent control mechanisms such as the attenuation controlled by the trp leader sequence.
The transcript of the trp operon includes 162 nucleotides upstream of the initiation codon for trpE (the first structural gene).
This leader mRNA includes a section encoding a leader peptide (or sensor) of 14 amino acids.
In short, if tryptophan is present (in moderate amounts), the sensor peptide is easily made and the long trp operon mRNA is NOT completed.
If tryptophan is scarce, the leader peptide is not easily made and the full operon is transcribed then translated into tryptophan synthetic enzymes.
Two adjacent tryptophan (trp) codons within the leader mRNA sequence are essential in the operon's regulation.
The leader mRNA contains four regions capable of base pairing in various combinations to form hairpin structures.
Attenuation depends upon the ability of regions 1 and 2 and regions 3 and 4 of the trp leader sequence to base pair and form hairpin secondary structures.
A part of the leader mRNA containing regions 3 and 4 and a string of eight U's is called the attenuator.
The region 3+4 hairpin structure acts as a transcription termination signal; as soon as it forms, the RNA and the RNA polymerase are released from the DNA.
During periods of tryptophan scarcity, a ribosome translating the coding sequence for the leader peptide may stall when it encounters the two tryptophan (trp) codons because of the shortage of tryptophan-carrying tRNA molecules.
Because a stalled ribosome at this site blocks region 1, a region 1+2 hairpin cannot form and an alternative, region 2+3 hairpin is formed instead.
The region 2+3 base pairing prevents formation of the region 3+4 transcription termination hairpin and therefore RNA polymerase can move on to transcribe the entire operon to produce enzymes that will synthesize tryptophan.
When tryptophan is readily available, a ribosome can complete translation of the leader peptide without stalling.
As it pauses at the stop codon, it blocks region 2, preventing it from base pairing.
As a result, the region 3+4 structure forms and terminates transcription near the end of the leader sequence and the structural genes of the operon are not transcribed (nor translated).
This is an example of a "riboswitch", a mechanism which can control transcription and translation through interactions of molecules with an mRNA.
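
The attenuation logic can be restated as a short decision function. This is a minimal sketch, not a simulation: the function name trp_leader_outcome and the dictionary keys are hypothetical, and the model treats tryptophan as simply abundant or scarce.

```python
def trp_leader_outcome(tryptophan_abundant: bool) -> dict:
    """Which leader hairpin forms, and whether the operon is transcribed."""
    if tryptophan_abundant:
        # The ribosome completes the leader peptide and pauses at the stop
        # codon, covering region 2, so regions 3 and 4 pair (terminator).
        ribosome = "paused at stop codon (blocks region 2)"
        hairpin = (3, 4)
    else:
        # The ribosome stalls at the two trp codons, covering region 1,
        # so region 2 is free to pair with region 3 instead.
        ribosome = "stalled at trp codons (blocks region 1)"
        hairpin = (2, 3)
    return {
        "ribosome": ribosome,
        "hairpin": hairpin,
        # Only the region 3+4 hairpin terminates transcription.
        "operon_transcribed": hairpin != (3, 4),
    }


for abundant in (True, False):
    print(abundant, trp_leader_outcome(abundant))
```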

Control of Eukaryotic Gene Regulation

Gene expression can be regulated by events that occur at many levels.
1) Genome (amplification or rearrangement of DNA segments, chromatin decondensation/condensation and DNA methylation).
2) Transcription.
3) Processing (and nuclear export) of RNA.
4) Translation (and targeting) of protein.
5) Posttranslational events (folding and assembly, cleavage, chemical group modifications and organelle import/secretion).
Degradation of mRNA and proteins is also subject to regulation.

Eukaryotic Gene Regulation: Genomic Control

Yeast mating-type switching depends upon the swapping of genetic cassettes to alter the DNA sequence.
Saccharomyces cerevisiae has two mating types (sexes): alpha and a.
Chromosome III contains three separate copies of the mating-type information.
The HMLalpha and HMRa loci contain complete copies of the alpha and a forms of the gene but the transcription of these loci is inhibited by the products of the SIR gene.
The cell's actual mating type is determined by the allele present at the MAT locus.
When a cell switches mating types, the alpha or a DNA at the MAT locus is removed and replaced by a DNA "cassette" copy of the alternative mating type DNA.
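
The cassette mechanism can be pictured as copying a value from a silent locus into the expressed MAT locus. The following Python sketch assumes a dictionary representation of chromosome III; the function name switch_mating_type and the key names are hypothetical.

```python
def switch_mating_type(chromosome_iii: dict) -> dict:
    """Return a new locus map with the MAT allele replaced by the other cassette."""
    current = chromosome_iii["MAT"]
    # The donor is whichever silent locus carries the alternative type.
    donor_locus = "HMR" if current == "alpha" else "HML"
    switched = dict(chromosome_iii)
    # The silent locus keeps its own copy; only MAT changes.
    switched["MAT"] = chromosome_iii[donor_locus]
    return switched


cell = {"HML": "alpha", "MAT": "alpha", "HMR": "a"}
after_one = switch_mating_type(cell)
after_two = switch_mating_type(after_one)
print(after_one["MAT"], after_two["MAT"])  # a alpha
```

Because the silent copies at HML and HMR are never consumed, the cell can switch back and forth repeatedly.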

Genes coding for the human antibody heavy chains are created by DNA rearrangements involving multiple types of V, D and J segments.
Initially, the DNA of the immune cells is arranged as tandem arrays of V, D and J regions.
DNA excision randomly removes several D and J segments to place individual D and J sequences side by side.
A second random excision removes several V and D segments to join a V section to the others to form a VDJ segment.
After transcription, the sequences separating the VDJ segment from the C segment are removed by RNA splicing.
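
The two random excision steps amount to choosing one segment of each type, so the number of possible heavy-chain VDJ combinations is the product of the segment counts. The sketch below illustrates this; the segment counts (40 V, 23 D, 6 J) and the function name rearrange_heavy_chain are assumptions chosen for illustration, not values taken from these notes.

```python
import random


def rearrange_heavy_chain(v_segments, d_segments, j_segments, rng):
    """Pick one V, one D and one J segment, mimicking the two excision steps."""
    # Step 1: excision places one D segment next to one J segment.
    d = rng.choice(d_segments)
    j = rng.choice(j_segments)
    # Step 2: a second excision joins one V segment to the DJ unit.
    v = rng.choice(v_segments)
    return (v, d, j)


# Hypothetical segment counts, for illustration only.
V = [f"V{i}" for i in range(1, 41)]
D = [f"D{i}" for i in range(1, 24)]
J = [f"J{i}" for i in range(1, 7)]

rng = random.Random(0)
print(rearrange_heavy_chain(V, D, J, rng))
print("possible VDJ combinations:", len(V) * len(D) * len(J))  # 40 * 23 * 6 = 5520
```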

Eukaryotic Gene Regulation: Transcriptional Control

Hybridization studies have demonstrated that differences between the populations of mRNA sequences in the cytoplasm of different kinds of cells reflect corresponding differences in nuclear RNA populations which result from differential transcription of genes.
Complementary DNA (cDNA) probes for tissue-specific transcripts are prepared by eliminating from the tissue's mRNA population the molecules also found in another tissue and reverse-transcribing the remaining mRNA.
Hybridization studies suggest that different subsets of genes are transcribed in different tissues.
Tissue specific gene regulation is responsible for these differences.
DNA microarrays are used to profile gene expression patterns in various cells.

In a typical protein-coding eukaryotic gene, the mRNA is transcribed by RNA polymerase II.
The core promoter is characterized by an initiator sequence surrounding the transcriptional startpoint and a sequence called a TATA box located about 25 bp upstream (to the 5 prime side) of the startpoint.
The core promoter is where the general transcription factors and RNA polymerase assemble for the initiation of transcription.
Within about 100 nucleotides upstream from the core promoter lie several proximal control elements, which stimulate transcription of the gene by interacting with regulatory transcription factors.
The number, identity and location of the proximal elements vary from gene to gene.
The transcription unit includes a 5 prime untranslated region (leader) and a 3 prime untranslated region (trailer) which are transcribed and included in the mRNA but do not contribute sequence information for the protein product.
These untranslated regions may contain expression control sequences.
In the primary transcript, at the end of the last exon is a site directing the cleavage of the RNA and poly(A) addition.

Properties of Enhancers

Recombinant DNA techniques have been used to alter the orientation and location of DNA control elements to study the effect of the change on the level of transcription.
The core promoter alone (just upstream of gene) allows a basal level of transcription to occur.
When the core promoter is removed from the gene, no transcription occurs. An enhancer alone cannot substitute for the promoter region, but combining an enhancer with a core promoter results in a significantly higher level of transcription than occurs with the promoter alone.
This increase in transcription is observed when the enhancer is 1) moved farther upstream, 2) inverted in orientation or 3) moved to the 3 prime side of the structural gene.
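
The experimental observations above can be condensed into a rule in which enhancer position and orientation are recorded but do not affect the outcome. This is a minimal sketch with hypothetical names (Enhancer, transcription_level) and qualitative output labels.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Enhancer:
    distance_bp: int   # distance from the core promoter
    inverted: bool     # orientation flipped relative to the original
    downstream: bool   # located on the 3 prime side of the gene


def transcription_level(core_promoter: bool, enhancer: Optional[Enhancer]) -> str:
    if not core_promoter:
        return "none"      # an enhancer cannot substitute for the promoter
    if enhancer is None:
        return "basal"     # core promoter alone
    # Distance, orientation and side are deliberately ignored here.
    return "enhanced"


variants = [
    Enhancer(distance_bp=200, inverted=False, downstream=False),
    Enhancer(distance_bp=5000, inverted=False, downstream=False),
    Enhancer(distance_bp=200, inverted=True, downstream=False),
    Enhancer(distance_bp=200, inverted=False, downstream=True),
]
print([transcription_level(True, e) for e in variants])
print(transcription_level(True, None), transcription_level(False, variants[0]))
```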

A Model for Enhancer Action
In this model, an enhancer located at a great distance along the DNA from the protein-coding gene it regulates is brought close to the core promoter by a looping of the DNA.
The influence of an enhancer on the promoter is mediated by regulatory transcription factors called activators.
1) The activator proteins bind to the enhancer elements, forming an enhanceosome.
2) Bending of the DNA brings the enhanceosome closer to the core promoter.
The general transcription factor TFIID is in the promoter's vicinity. For the purpose of this figure, two of the protein subunits of TFIID, which will function as coactivators in step 3, are distinguished from the rest of the factor.
3) The DNA-bound activators interact with specific coactivators that are part of TFIID. This interaction facilitates the correct positioning of TFIID on the promoter.
4) The other general transcription factors and RNA polymerase join the complex, and transcription is initiated.

The gene for the protein albumin, like other genes, is associated with an array of regulatory DNA elements; here we show only two control elements, as well as the core promoter.
Cells of all tissues contain RNA polymerase and the general transcription factors, but the set of regulatory transcription factors available varies with the cell type.
Liver cells contain a set of regulatory transcription factors that includes the factors for recognizing all the albumin gene control elements.
When these factors bind to the DNA, they facilitate transcription of the albumin gene at a high level.
Brain cells, however, have a different set of regulatory transcription factors, which does not include all the ones for the albumin gene. Consequently, in brain cells, the transcription complex can assemble at the promoter, but not very efficiently.
The result is that brain cells transcribe the albumin gene only at a low level.
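
The liver versus brain comparison is essentially a set comparison between the control elements of a gene and the regulatory factors a cell type contains. The sketch below uses entirely hypothetical factor and element names, and reduces the outcome to "high" or "low".

```python
# Hypothetical names: two albumin control elements and three regulatory factors.
ALBUMIN_CONTROL_ELEMENTS = {"element_1", "element_2"}

# Which control element each (hypothetical) factor recognizes.
FACTOR_TARGETS = {
    "factor_A": "element_1",
    "factor_B": "element_2",
    "factor_C": "element_other",  # recognizes an element of some other gene
}

# Regulatory transcription factors assumed to be present in each cell type.
CELL_FACTORS = {
    "liver": {"factor_A", "factor_B"},
    "brain": {"factor_A", "factor_C"},
}


def albumin_transcription(cell_type: str) -> str:
    """'high' only if every albumin control element is recognized by some factor."""
    recognized = {FACTOR_TARGETS[f] for f in CELL_FACTORS[cell_type]}
    bound = recognized & ALBUMIN_CONTROL_ELEMENTS
    return "high" if bound == ALBUMIN_CONTROL_ELEMENTS else "low"


for cell_type in CELL_FACTORS:
    print(cell_type, albumin_transcription(cell_type))
```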

Several structural motifs are commonly found in the DNA-binding domains of regulatory transcription factors.
The parts of these domains that directly interact with specific DNA sequences are usually alpha helices (or recognition helices) which fit into DNA's major groove.
The helix-turn-helix motif contains two alpha helices joined by a short flexible turn.
The zinc finger motif consists of an alpha helix and a two segment, antiparallel beta sheet, all held together by the interaction of four cysteine residues (or two cysteine & two histidine residues) with a zinc atom.
Zinc finger proteins normally contain a number of zinc fingers.
The leucine zipper motif contains an alpha helix with a regular arrangement of leucine residues; it interacts with a similar region in a second polypeptide so that the two helices coil around each other.
The helix-loop-helix motif contains a short alpha helix connected to a longer alpha helix by a polypeptide loop; it interacts with a similar region on another polypeptide to create a dimer.
The homeodomain is a helix-turn-helix DNA-binding domain containing three alpha helices encoded by a 180 basepair homeobox.
The homeodomain was originally found in homeotic genes which are very important in development.
The homeodomain Hox genes control the head-to-tail development in animals from flies to mammals.

The DNA response sequences that bind transcription factors are often composed of inverted repeat elements.
Reading the sequence of the glucocorticoid response element in the 5 prime to 3 prime direction from either end yields the same DNA sequence (5 prime-AGAACA-3 prime).
The thyroid hormone element contains the same inverted repeat sequences as the estrogen element but the three bases that separate the two copies of the sequence in the estrogen element are absent.
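
The inverted repeat property can be checked with a reverse complement: reading the bottom strand 5 prime to 3 prime is the same as taking the reverse complement of the top strand. In the sketch below the three-base spacer is written as NNN, which is an assumed placeholder, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5 prime to 3 prime."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def half_sites(element: str, half_len: int = 6) -> tuple:
    """The first half_len bases read 5 prime to 3 prime from each end of the duplex."""
    from_top_strand = element[:half_len]
    from_bottom_strand = reverse_complement(element)[:half_len]
    return from_top_strand, from_bottom_strand


# Top strand of a glucocorticoid response element; the spacer NNN is a placeholder.
gre = "AGAACA" + "NNN" + "TGTTCT"
print(half_sites(gre))  # ('AGAACA', 'AGAACA')
```

An element with this property is its own reverse complement, which is why a dimer of identical receptor molecules can bind it symmetrically.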

The glucocorticoid receptors activate gene transcription.
Cortisol, a hydrophobic steroid hormone, can diffuse through a plasma membrane then bind to the intracellular glucocorticoid receptor.
Binding the steroid causes the release of an inhibitory protein and activates the glucocorticoid receptor molecule's DNA binding site.
The glucocorticoid receptor molecule then enters the nucleus and binds to a glucocorticoid response element in DNA which causes a second glucocorticoid receptor molecule to bind to the same response element.
The resulting glucocorticoid receptor dimer activates transcription of the target gene.

The cAMP response element binding protein (CREB) controls gene expression when cAMP levels increase.
Genes activated by cyclic AMP possess an upstream cyclic AMP response element (CRE) that binds a transcription factor called the CREB protein.
In the presence of cyclic AMP, cytoplasmic protein kinase A is activated and its activated catalytic subunit then moves into the nucleus, where it catalyzes phosphorylation of the CREB protein, thereby stimulating its activation domain.

Eukaryotic Gene Regulation: Translational Control

The antibody protein immunoglobulin M (IgM) exists in two forms, as secreted IgM and membrane-bound IgM.
These molecules, encoded by a single gene, differ in their heavy chain’s carboxyl ends.
The IgM gene has two possible poly(A) addition (termination) sites and a number of exons that can produce two alternative forms.
The plasma membrane-bound form contains a transmembrane anchor which is encoded by exons 5 and 6.
If a splice junction within exon 4 is used, exons 5 and 6 (carrying the anchor) are added to generate the membrane-bound IgM heavy chain.
The secreted product is produced when the exon 4 splice is not made and these transcripts are terminated just after exon 4.
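
The choice between the two IgM forms can be written as a choice of exon list. This sketch assumes the shared exons are numbered 1 to 4 and ignores the detail that the splice junction lies within exon 4; the function names are hypothetical.

```python
def igm_heavy_chain_exons(use_exon4_splice: bool) -> list:
    """Exons present in the mature heavy-chain mRNA (simplified)."""
    exons = [1, 2, 3, 4]  # assumption: exons shared by both forms
    if use_exon4_splice:
        # Splicing from within exon 4 adds the exons encoding the membrane anchor.
        exons += [5, 6]
    return exons


def igm_form(exons: list) -> str:
    return "membrane-bound" if 5 in exons and 6 in exons else "secreted"


for splice in (True, False):
    exons = igm_heavy_chain_exons(splice)
    print(exons, igm_form(exons))
```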

Translation of ferritin is activated in the presence of iron.
Translation is inhibited by binding of the IRE-binding protein to the hairpin structure of an iron response element (IRE) in the 5 prime untranslated leader sequence of ferritin mRNA.
When iron binds to IRE-binding protein, it contorts into a conformation that does not recognize the IRE.
When iron is available, ribosomes can assemble on the mRNA and proceed to translate ferritin.
On its own, the hairpin does not interfere with ribosome activity.

Degradation of the transferrin receptor mRNA (required for iron uptake) is also regulated by the allosteric IRE-binding protein.
Transferrin receptor mRNA has an IRE in its 3 prime untranslated region.
When intracellular [iron] is low, the IRE-binding protein remains bound to the IRE, which 1) protects the mRNA from degradation and 2) allows more transferrin receptor protein to be synthesized.
When intracellular [iron] is high, iron binds to the IRE-binding protein, which releases the IRE, and the mRNA can be degraded.
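
The same IRE-binding protein has opposite effects on the two mRNAs because of where the IRE sits. The sketch below states this as a function of iron level; the function name iron_response and the dictionary keys are hypothetical, and iron is treated as simply high or low.

```python
def iron_response(iron_high: bool) -> dict:
    """Effect of intracellular iron on ferritin and transferrin receptor mRNAs."""
    # Iron-bound IRE-binding protein no longer recognizes the IRE.
    ire_protein_bound = not iron_high
    return {
        # 5 prime IRE (ferritin): a bound protein blocks translation.
        "ferritin_translated": not ire_protein_bound,
        # 3 prime IRE (transferrin receptor): a bound protein protects the mRNA.
        "transferrin_receptor_mrna_protected": ire_protein_bound,
    }


print("high iron:", iron_response(True))
print("low iron: ", iron_response(False))
```

High iron therefore favors iron storage (ferritin is made) and reduces uptake (the receptor mRNA is degraded), and low iron does the reverse.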

RNA interference: siRNA plus microRNA

By RNA interference, short RNA's can lead to silencing the expression of genes that contain complementary sequences in their mRNA.
A complex of double-stranded RNA is cleaved into short fragments of 21-22 basepairs in length by the ribonuclease Dicer.
The fragments are siRNA's (short interfering RNA's).
The siRNA's bind to the RISC (RNA-induced silencing complex).
One of the strands of siRNA is degraded.
The remaining single-stranded siRNA, complexed with the RISC, can then bind to complementary mRNA.
If the match is perfect or near perfect, the mRNA is cleaved.
In addition, the RISC-siRNA complex can enter the nucleus, bind the genomic sequence and initiate inactivation of the gene through DNA methylation-based chromatin condensation.
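
The dicing and targeting steps can be mimicked with simple string operations. This is a rough sketch under stated assumptions: the sequences are made up, fragments are cut at a fixed 21 nucleotides, only one strand of the double-stranded RNA is followed, and only perfect matches are recognized (the "near perfect" case is not modeled). All function names are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def dice(strand: str, size: int = 21) -> list:
    """Cut one strand into consecutive fragments of `size` nucleotides."""
    return [strand[i:i + size] for i in range(0, len(strand) - size + 1, size)]


def risc_cleaves(guide: str, mrna: str) -> bool:
    """True if the mRNA contains a perfect complement of the guide strand."""
    return reverse_complement(guide) in mrna


# Made-up sequences, for illustration only.
target_region = "AUGGCUAGCUUAGGCAUCGAUCCGAUUGCAACGUACGGAUCU"
mrna = "GGGAGA" + target_region + "AAUAAA"
antisense = reverse_complement(target_region)  # one strand of the dsRNA trigger

guides = dice(antisense)
for guide in guides:
    print(guide, risc_cleaves(guide, mrna))
```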

In a related mechanism, microRNAs (miRNAs) are RNA gene products that are 21-22 nucleotides in length.
The primary miRNAs are transcribed, form hair-pin structures and are cleaved by Drosha to make precursor microRNAs (roughly 70 nucleotides in length).
The pre-miRNAs are exported to the cytoplasm where they are cleaved by Dicer into the 21-22 nucleotide mature microRNA's.
The miRNA's form ribonucleoprotein complexes with mRNA's.
    If the match is exact, the mRNA is destroyed, similar to siRNA mechanisms.
    If the match is less-than-exact, then binding (usually of several miRNA's) inhibits translation.
Genes for miRNA's seem to make up 0.5-1.0% of the total number of genes in multicellular organisms.
    i.e. 200-250 miRNA genes in humans.

Eukaryotic Gene Regulation: Post-translational Control

Protein degradation control is not well understood.
In rats undergoing food deprivation the liver enzyme arginase is degraded more slowly than usual.
Note that most proteins degrade more rapidly during starvation.
Proteins can be marked for destruction by the addition of ubiquitin.
1) A protein targeted for degradation is bound at its N-terminus by a ubiquitinating enzyme complex.
2) In an ATP-dependent series of reactions, ubiquitin molecules are sequentially attached to the protein's lysine residues.
The ubiquitinating enzyme complex then detaches.
3) A proteasome degrades the ubiquitinated protein into short peptides.
The ubiquitin is released and can be recycled.

Notes prepared from Becker's World of the Cell, 9th edition
Hardin & Bertoni, 2015
Figures copyright of Pearson Education Inc.