Molecules provide an independent estimate of phylogeny
Avoids a circular argument:
Morphology is used to create a classification, then the classification is interpreted to explain evolution
Ex.: Chinese water deer (Hydropotes) are the only antlerless deer => placed in a separate subfamily & assumed to be ancestral type
But, molecular analysis shows antlers were lost secondarily
Molecules provide large numbers of characters for analysis
Homo has ca. 200 bones and 3,000,000,000 nucleotide pairs
Typical morphological study involves <100 characters
Typical molecular study involves >1,000
Patterns of molecular evolution are understood
Transitions (Ts) are more frequent than Transversions (Tv)
[recall Bio2250: Ts = C ↔ T or A ↔ G, Tv = everything else]
'silent' >> 'replacement' substitutions
3rd position >> 2nd & 1st substitutions (usually)
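The Ts / Tv distinction above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lab software: the function names are hypothetical, and sites containing anything other than A, C, G, or T (gaps, ambiguity codes) are assumed to be skipped.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(x: str, y: str) -> str:
    """Return 'Ts', 'Tv', or 'none' for one pair of aligned nucleotides."""
    x, y = x.upper(), y.upper()
    if x not in "ACGT" or y not in "ACGT" or x == y:
        return "none"  # identical, or not a plain nucleotide (assumption: skip)
    # A transition stays within purines (A <-> G) or within pyrimidines (C <-> T).
    same_class = {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    return "Ts" if same_class else "Tv"

def count_ts_tv(seq1: str, seq2: str) -> tuple:
    """Count (transitions, transversions) between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    kinds = [classify_substitution(a, b) for a, b in zip(seq1, seq2)]
    return kinds.count("Ts"), kinds.count("Tv")
```

For example, `count_ts_tv("ACGTAC", "GCGTCC")` finds one transition (A to G at the first site) and one transversion (A to C at the fifth site).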
Relative importance of characters is easier to judge
Is the # of toes more important than # of teeth?
Are scales versus feathers more important than # of temporal openings?
But: Any one nucleotide position is more or less like any other
1. Defining the systematic problem: "Is the Giant Panda a bear or a raccoon?"
Evolutionary relationships of the Giant Panda (Ailuropoda)
Ailuropoda has been considered to be either a bear (Ursidae) or a raccoon (Procyonidae)
General morphology suggests ursid ancestors:
Details of skull, diet, biogeography seem to suggest procyonid ancestors
Ex.: alar canal is present in Ursidae (including Ailuropoda), absent in Procyonidae, except lesser panda (Ailurus)
2. Collecting the data:
Measure homologous characters in a set of taxa:
with DNA, each nucleotide position is a separate character
mitochondrial DNA (mtDNA) is used in many systematic studies
"Small circular molecule ...", 16 kbp, maternally inherited (cytoplasmic)
13 protein loci (fast), 2 rDNA & 22 tDNA genes (slow), control region (very fast)
'Universal primers' permit PCR & DNA sequencing from many taxa
cytochrome b gene is widely used:
Large database for comparison
1140 bp in most vertebrates; we examine 401 bp in lab
3. Analyzing the data:
Phenetic (how similar are taxa?) versus cladistic (how closely related are taxa?) criteria
These criteria agree iff rates of evolution are constant
If evolutionary rates differ, closely related organisms may appear different
Ex.: Crocodiles are more closely related to birds, but more similar to lizards
Crocodiles resemble lizards more than birds because birds rapidly evolved specializations for flight
A. Phenetic analysis
Simplest measure is sequence similarity (S), the proportion of identical sites
p-distance (%) = (1 - S) x 100
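As a minimal sketch of the p-distance calculation, assuming two already-aligned sequences of equal length and treating S as a proportion between 0 and 1 (the function name is illustrative only):

```python
def p_distance(seq1: str, seq2: str) -> float:
    """Percent of aligned sites that differ: (1 - S) x 100."""
    if not seq1 or len(seq1) != len(seq2):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(a == b for a, b in zip(seq1, seq2))
    similarity = matches / len(seq1)  # S, as a proportion
    return (1 - similarity) * 100
```

Two 10-bp sequences that differ at one site have S = 0.9 and a p-distance of about 10%.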
Patterns of similarity can be inferred from cluster analysis
Most widely used is UPGMA [Unweighted Pair Group Method with Averaging], a Sequential Agglomerative Hierarchical Nesting (SAHN) algorithm
[algorithm = a set of instructions for doing a repetitive task]
In (n) x (n) matrix, join the most similar pair
re-calculate (n-1) x (n-1) matrix, re-join, and so on, until last pair is joined
Results are shown as a phenogram: a diagram of phenetic relationships
UPGMA method assumes that rates of evolution are equal so branch tips "come out even" (contemporaneous)
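The join / re-calculate / re-join loop above can be sketched as follows. This is a minimal illustration, not the algorithm as implemented in any particular package: the function name, the dictionary-of-pairs input format, and the example distances are all assumptions made for the sketch. Distances to a newly joined cluster are averaged in proportion to the number of tips on each side, which is the usual UPGMA update.

```python
from itertools import combinations

def upgma(names, dist):
    """Cluster taxa by UPGMA.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    Returns a nested tuple (left, right, height); tips are plain labels.
    """
    clusters = {name: 1 for name in names}  # cluster -> number of tips in it
    d = dict(dist)
    while len(clusters) > 1:
        # Join the most similar (least distant) pair in the current matrix.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((a, b))] / 2  # equal rates: both tips equally far from the node
        merged = (a, b, height)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Re-calculate the smaller matrix: average distance to the new cluster.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))

# Hypothetical distances, for illustration only.
example = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
}
tree = upgma(["A", "B", "C"], example)
```

With these made-up distances, A and B join first at height 1.0, and C joins that pair at height 3.0, so every tip ends at the same distance from the root.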
Some alternatives:
Neighbour-Joining (NJ) analysis does not assume rate equality
when rates are assumed equal (as in UPGMA), large evolutionary rate differences lead to incorrect trees
NJ allows branch lengths proportional to change: tips come out uneven
[algorithm joins nodes, rather than tips]
This method is more realistic, computationally harder
[see www.megasoftware.net for free software]
Differential weighting of nucleotide substitutions
accord greater 'significance' to 'important' changes
Ex.: Kimura 2-parameter distance (K2P) model treats Ts & Tv separately
K = transition bias = [Ts] / [Tv]
There are twice as many kinds of transversions as transitions: expected K = 0.5 if all substitutions are equally likely
But: Tv are rare for close comparisons, more common for distant relationships
K is variable according to the evolutionary problem under consideration: K > 6 for close comparisons
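A minimal sketch of how the K2P distance and the observed transition bias K might be computed for two aligned sequences. The notes above do not give the K2P formula; the one used here is the standard Kimura (1980) expression, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. The function name is hypothetical, and sites with characters other than A, C, G, T are assumed to be skipped.

```python
import math

def k2p_distance(seq1: str, seq2: str):
    """Return (K2P distance, observed Ts/Tv ratio K) for two aligned sequences."""
    purines, pyrimidines = {"A", "G"}, {"C", "T"}
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguity codes
        n += 1
        if a == b:
            continue
        if {a, b} <= purines or {a, b} <= pyrimidines:
            ts += 1  # transition
        else:
            tv += 1  # transversion
    if n == 0:
        raise ValueError("no comparable sites")
    p, q = ts / n, tv / n
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    distance = -0.5 * math.log(inner)
    ratio = ts / tv if tv else math.inf  # K is undefined when no Tv are seen
    return distance, ratio
```

For close comparisons with few or no transversions the ratio becomes very large (or infinite in this sketch), which is consistent with the high K values noted above.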
B. Cladistic Analysis
Principles of homology & analogy can be applied to nucleotide changes
We rely only on shared derived (synapomorphic) nucleotide sites, & avoid shared ancestral (symplesiomorphic) nucleotide sites, and changes unique to single taxa (autapomorphies), and convergent nucleotides between unrelated taxa.
Choice of preferred hypothesis is made on the Principle of Parsimony
In general: parsimony means that the simpler hypothesis is to be preferred
complex hypotheses are less probable
Evolutionary parsimony: a hypothesis that requires fewer character changes is preferred
Ex.: to explain the origin of a complex structure it is more parsimonious to hypothesize that it has evolved only once
In molecular systematics, these changes are nucleotide substitutions [DNA mutations]
The "Four-Taxon Problem" and the "Three-Taxon Statement":
Among four taxa A, B, C, & D, there are three hypotheses of relationship:
either A is most closely related to B, or to C, or to D
We want to be able to evaluate hypotheses of the form:
"X and Y are more closely related to each other than either is to Z"
The alternative hypotheses can be shown as networks with branches and an internode


Types 1 - 4 are uninformative:
They give no information about relationships, because all hypotheses require the same number of changes, so none is more parsimonious than the others.
Type 1 is invariant: no changes are required.
Type 2 indicates only that one taxon is unique wrt the others: all hypotheses require a single nucleotide change.
Type 3 indicates that all taxa are distinct & unique: all hypotheses require three nucleotide changes.
Type 4 indicates that two taxa are similar, but not whether this similarity is ancestral or derived:
shared a could be either
hypothesis requires two changes
Alternative hypotheses also require two nucleotide changes.
[a '+' indicates a change along a particular network branch]

Types 5, 6 & 7 are informative:
They give information about relationships, because one hypothesis requires fewer changes than the others & is therefore more parsimonious than the others
Type 5 indicates that A & B are most closely related:
The first hypothesis can explain the distribution of nucleotides with a single change, the latter two require two changes each.
[See also lab #5]
The first hypothesis is a more parsimonious explanation of the data than the others.

By the same logic:
Type 6 indicates that A & C are most closely related.
Type 7 indicates that A & D are most closely related.
[Homework: for the three networks above, sketch the changes required by sites of types 6 & 7]
A cladistic analysis counts the number of informative characters favouring each hypothesis
The hypothesis with the "highest score" requires the fewest changes and is therefore the 'most parsimonious' explanation.
This is also called the 'minimum length' solution.
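The site types and the scoring step can be sketched for four aligned sequences as below. This is a minimal illustration under stated assumptions: the function names and the example sequences are invented, the taxa are assumed to be given in the order A, B, C, D, and the type numbers follow the definitions in these notes.

```python
from collections import Counter

def site_type(a: str, b: str, c: str, d: str) -> int:
    """Classify one aligned site for taxa A, B, C, D into types 1-7."""
    counts = sorted(Counter((a, b, c, d)).values(), reverse=True)
    if counts == [4]:
        return 1  # invariant
    if counts == [3, 1]:
        return 2  # one taxon unique
    if counts == [1, 1, 1, 1]:
        return 3  # all four distinct
    if counts == [2, 1, 1]:
        return 4  # one similar pair, other two distinct
    # Remaining pattern is two pairs, the only informative case.
    if a == b:
        return 5  # favours A & B
    if a == c:
        return 6  # favours A & C
    return 7      # favours A & D

def parsimony_scores(seq_a: str, seq_b: str, seq_c: str, seq_d: str) -> dict:
    """Count informative sites favouring each of the three hypotheses."""
    types = Counter(site_type(*column) for column in zip(seq_a, seq_b, seq_c, seq_d))
    return {"A+B": types[5], "A+C": types[6], "A+D": types[7]}

# Invented 8-site alignment: one site each of types 1-4, 6, 7 and two of type 5.
scores = parsimony_scores("AAAAAAAG", "AAACCACG", "AACACCGT", "ACCCAGTT")
```

Here `scores` gives A+B two supporting sites and the other hypotheses one each, so the A+B network is the minimum-length solution for this made-up alignment.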
Cladistic analyses may also be weighted:
Ex.: Count Tv:Ts as 3:1 (Tv are 3x as meaningful)
or, count Tv only (Transversion parsimony) for "deep" analyses
or, count 1st & 2nd position substitutions >> 3rd
C. Placing the root & Inferring the direction of evolutionary change
Suppose the first hypothesis (A & B are most closely related) is most parsimonious
Ex.: In Lab #5, we found that the majority of sites were of type #5. We said:
"Ailuropoda & Ursus are more closely related to each other than either is to Procyon (or Martes)."
The hypothesis can be drawn as an unrooted network
But: this evidence can also be used to argue
"Procyon & Martes are more closely related to each other than either is to Ursus (or Ailuropoda)."
To resolve this, we need to know where their common ancestor fits in.
There are four branches and one internode in this network
An evolutionary tree is a network with a root:
The root indicates the relationship with the common ancestor
A 'root' can be placed on any of the branches or the internode.
So, there are five possible rooted trees for this unrooted network.
All are equally parsimonious: not all place A & B as each other's closest relatives.
Some of these make shared characters symplesiomorphic
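The counts above (three unrooted networks for four taxa, five root positions on each) are instances of general counting formulas for fully bifurcating trees, which are not stated in these notes: an unrooted tree of n taxa has 2n - 3 segments on which a root can be placed, and there are 3 x 5 x ... x (2n - 5) distinct unrooted trees. A minimal sketch, with illustrative function names:

```python
def count_unrooted_trees(n: int) -> int:
    """Number of distinct unrooted bifurcating trees for n >= 3 taxa."""
    if n < 3:
        raise ValueError("need at least three taxa")
    total = 1
    for k in range(3, 2 * n - 4, 2):  # 3, 5, ..., 2n - 5
        total *= k
    return total

def count_root_positions(n: int) -> int:
    """Number of segments (tip branches plus internodes) that can carry the root."""
    return 2 * n - 3

def count_rooted_trees(n: int) -> int:
    """Each unrooted tree can be rooted on any one of its segments."""
    return count_unrooted_trees(n) * count_root_positions(n)
```

For n = 4 this gives 3 unrooted networks, 5 root positions each, and 15 rooted trees in all; the numbers grow very quickly as taxa are added.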
There are several ways to place the root
(1) Outgroup rooting:
Include a taxon that is known to be less closely related to any of the ingroup taxa than they are to each other.
Such a taxon is called an outgroup or sister taxon.
Ex.: Lynx (Feloidea) is an outgroup to the Canoidea
(Note that this tree is equivalent to the NJ phenogram)
(2) Midpoint rooting:
Place the root halfway between the two most different taxa.
This assumes that molecular evolution is clock-like.
(Here, this places the root on the internode).
(3) Character Polarity:
If the character state of the ancestor is known (or can be inferred), root the tree accordingly
Use of polarity is usually not possible with molecular data
Any nucleotide can mutate to any other, in either direction
any a c g t looks exactly like any other a c g t
[Some models allow for differential probabilities of mutation]
Homologous nucleotide in ancestor has most likely mutated
Use of polarity with morphological data is standard
Ex.: In an analysis of the evolution of the number of heart chambers in codfish (2), lizard (3), crocodile (4), & bird (4) we know that the evolutionary order is 2 → 3 → 4
(this is called a transformation series)
The root will be placed on the codfish branch, because we know the codfish most resembles the ancestor.
Crocs & Birds have a recent common ancestor with a four-chambered heart.
D. Statistical tests determine confidence in branching order
Bootstrap Analysis: a re-sampling technique
statistical tests usually involve obtaining replicates / repeating experiment
this is inconvenient with DNA data: $$$
Suppose existing data set (401 bp) is a random sample of parametric data set (complete genome)
re-sample existing n sites (with replacement) 1000 times, repeat phylogenetic analysis:
how often do same clades / clusters appear?
More than "50% bootstrap support" indicates particular group occurs more frequently than all others combined
95% criterion is desirable, not often obtained with small data sets
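The re-sampling step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by any particular program: the "phylogenetic analysis" repeated on each replicate is simply a count of informative two-pair sites among four taxa (in the order A, B, C, D), tied replicates are counted for no hypothesis, and the function names, seed, and example data are invented.

```python
import random
from collections import Counter

def best_grouping(columns):
    """Return the grouping favoured by the most informative sites, or None on a tie."""
    scores = Counter()
    for a, b, c, d in columns:
        if a == b and c == d and a != c:
            scores["A+B"] += 1
        elif a == c and b == d and a != b:
            scores["A+C"] += 1
        elif a == d and b == c and a != b:
            scores["A+D"] += 1
    ranked = scores.most_common()
    if not ranked or (len(ranked) > 1 and ranked[0][1] == ranked[1][1]):
        return None
    return ranked[0][0]

def bootstrap_support(seqs, replicates=1000, seed=0):
    """Percent of replicates in which each grouping is the most parsimonious."""
    columns = list(zip(*seqs))  # one tuple of four nucleotides per site
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(replicates):
        # Draw n sites with replacement from the n observed sites.
        sample = rng.choices(columns, k=len(columns))
        wins[best_grouping(sample)] += 1
    return {g: 100 * wins[g] / replicates for g in ("A+B", "A+C", "A+D")}

# Invented data in which every site supports A+B.
support = bootstrap_support(["AAAA", "AAAA", "CCCC", "CCCC"])
```

With this artificial alignment every replicate recovers A+B, so its support is 100%; with real data, conflicting sites lower the percentage.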
What does this analysis explain about the biology & evolution of Pandas?
1. Ailuropoda and Ursus are each other's closest relatives:
The Giant Panda is a highly derived bear, not a raccoon.
Ailuropoda should be classified in Ursidae.
2. Similarities of Ailuropoda and Ailurus are convergent (analogous):
these represent parallel feeding specializations.
Ex.: "Hypertrophied masticatory apparatus" permits feeding on bamboo:
(expanded zygomatic arch, high mandibular ramus, and molariform teeth)
Jaw articulation above toothrow gives mechanical advantage:
(similar modifications occur in Hyaena for crushing bones).
3. Some similarities between Ailuropoda and other ursids are ancestral homologies:
Bears (including pandas) have short gestation and tiny neonates.
In most bears, gestation & birth occur during winter hibernation:
Hypothesis: early parturition (birth) gives access to milk, when no other food is available
But: Pandas do not hibernate, young are carried during foraging:
Why have altricial (underdeveloped) young when food is readily available?
"Small young could be explained if the suite of physiological and behavioural adaptations associated with the production of small neonates were established before splitting of the panda and ursid lines." (Ramsay & Dunbrack, 1987)
That is, tiny neonates are a conserved ancestral condition rather than a contemporary adaptive response.
4. Panda evolution may have proceeded quickly:
Ailuropoda & Ursus are about as similar genetically as dog & fox.
Development & growth of cranial versus axial skeleton in pandas resembles Hyaenas and boxer dogs:
heavy crania, less-developed post-cranial (axial) skeletons.
Selection may operate on similar, hypothetical 'growth fields'
"The basic adaptive transition from Ursus to Ailuropoda required the changing of very few genetic messages [during an] origin by way of a very small population occupying a local bamboo forest." (Stanley 1979)
Pandas may be a textbook case of quantum speciation:
the origin of a new adaptive type in one or a few speciation events.
Text material © 2010 by Steven M. Carr