UPGMA vs. WPGMA

The method used in this example is called WPGMA (weighted pair group method with arithmetic mean) because the distance between clusters is calculated as a simple average of the two cluster-level distances, regardless of how many taxa each cluster contains. For example, in the last step the WPGMA distance between (AB) and C+(DE) = (55 + 90) / 2 = 72.5. Though WPGMA is computationally simpler, when the clusters contain unequal numbers of taxa, the distances in the original matrix do not contribute equally to the intermediate calculations.

A superior method is UPGMA (unweighted PGMA), in which averages are weighted by the number of taxa in each cluster at each step. This makes the calculation slightly more complicated. For example, in the last step the UPGMA distance between (AB) and C+(DE) = (55 + 2 × 90) / 3 = 78.33, because the distance is the average of three distances: (AB) to C, (AB) to D, and (AB) to E. As a result, each distance contributes equally to the final result.
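
The two last-step calculations can be reproduced with a minimal Python sketch. The function names are hypothetical, and the sketch assumes, as the arithmetic above implies, that 55 is the (AB)-to-C distance and 90 is the (AB)-to-(DE) distance.

```python
def wpgma_distance(d_first, d_second):
    """Simple average of the two cluster-level distances; cluster sizes are ignored."""
    return (d_first + d_second) / 2


def upgma_distance(d_first, n_first, d_second, n_second):
    """Average in which each cluster-level distance counts once per taxon it contains."""
    return (n_first * d_first + n_second * d_second) / (n_first + n_second)


# Last step of the example: joining (AB) to the cluster C+(DE).
# Assumed inputs: (AB)-C = 55 with C holding 1 taxon,
#                 (AB)-(DE) = 90 with (DE) holding 2 taxa.
wpgma_result = wpgma_distance(55, 90)
upgma_result = upgma_distance(55, 1, 90, 2)

print(f"WPGMA: {wpgma_result:.2f}")  # 72.50
print(f"UPGMA: {upgma_result:.2f}")  # 78.33
```

When both clusters hold the same number of taxa, the two functions return the same value; they differ only when the cluster sizes are unequal, as here.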

Note that the terms weighted and unweighted refer to the final result, not the math by which it is achieved. Thus the simple averaging in WPGMA produces a weighted result, and the proportional averaging in UPGMA produces an unweighted result.
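
To see why the simple average gives a weighted result, it may help to track what share of the final distance each taxon-level distance receives. The sketch below is illustrative only; `per_taxon_shares` is a hypothetical helper, and it assumes the (AB)-to-(DE) distance is itself an equal average of (AB)-to-D and (AB)-to-E.

```python
from fractions import Fraction


def per_taxon_shares(sizes, proportional):
    """Share of the merged distance carried by each taxon within each cluster.

    sizes: dict mapping cluster name to its number of taxa.
    proportional: True for UPGMA-style averaging, False for WPGMA-style.
    """
    total_taxa = sum(sizes.values())
    shares = {}
    for name, n in sizes.items():
        if proportional:
            cluster_share = Fraction(n, total_taxa)   # cluster weighted by its size
        else:
            cluster_share = Fraction(1, len(sizes))   # every cluster weighted equally
        shares[name] = cluster_share / n              # split evenly among its taxa
    return shares


sizes = {"C": 1, "DE": 2}
wpgma_shares = per_taxon_shares(sizes, proportional=False)
upgma_shares = per_taxon_shares(sizes, proportional=True)

print(wpgma_shares)  # C carries 1/2; D and E carry 1/4 each
print(upgma_shares)  # C, D, and E carry 1/3 each
```

Under simple averaging, the distance to C counts twice as much as the distance to D or to E, which is the sense in which the WPGMA result is weighted.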

Text material © 2007 by Steven M. Carr   [with thanks to Nell Lund]