UPGMA vs. WPGMA

    The algorithm used in this example is technically called WPGMA (Weighted Pair Group Method with Arithmetic Mean), because the distance between pairs of clusters is calculated as a simple average. For example, in the last step the WPGMA distance between (AB) and C+(DE) = (55 + 90) / 2 = 72.5. This is computationally simpler, but because the clusters contain unequal numbers of taxa, the distances in the original matrix do not contribute equally to the intermediate calculations. The resulting distances are therefore weighted.

    A superior method is UPGMA (unweighted PGMA), in which averages are weighted by the number of taxa in each cluster at each step. This makes the calculation slightly more complicated. For example, in the last step the UPGMA distance between (AB) and C+(DE) = [55 + (2 × 90)] / 3 = 78.33, because the distance is the average of three distances: (AB) to C, (AB) to D, and (AB) to E. As a result, each original distance contributes equally to the final result, which is therefore unweighted.
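
    The two calculations above can be sketched in a few lines of Python. This is only an illustration of the arithmetic for the last step, not a full clustering program; the function and variable names are hypothetical, and the only values taken from the example are the distances 55 and 90 and the cluster sizes 1 (C) and 2 (DE).

```python
def wpgma_distance(d_ik, d_jk):
    """Simple average of the distances to the two sub-clusters (WPGMA)."""
    return (d_ik + d_jk) / 2


def upgma_distance(d_ik, n_i, d_jk, n_j):
    """Average weighted by the number of taxa in each sub-cluster (UPGMA)."""
    return (n_i * d_ik + n_j * d_jk) / (n_i + n_j)


# Last step of the example: (AB) to C is 55, (AB) to (DE) is 90.
d_ab_c, d_ab_de = 55, 90

print(wpgma_distance(d_ab_c, d_ab_de))                  # 72.5
print(round(upgma_distance(d_ab_c, 1, d_ab_de, 2), 2))  # 78.33
```

    The only difference between the two functions is whether the cluster sizes enter the average.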

    Note that the terms weighted and unweighted refer to the final result, not the math by which it is achieved. Counter-intuitively, the simple averaging in WPGMA produces a weighted result, and the proportional averaging in UPGMA produces an unweighted result. Online programs almost invariably calculate a WPGMA and call it a UPGMA.
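
    One way to see why the names refer to the result is to work out how much each taxon's distance to (AB) contributes in the last step. The sketch below does this for C and (DE); the function name and its arguments are hypothetical, and it assumes that the taxa inside a sub-cluster were themselves averaged equally, which holds for a two-taxon cluster such as (DE).

```python
from fractions import Fraction


def per_taxon_weights(clusters, proportional):
    """Return the weight each taxon's distance receives when clusters are merged.

    clusters: list of tuples of taxon names, e.g. [("C",), ("D", "E")].
    proportional=False averages the sub-clusters equally (WPGMA);
    proportional=True weights each sub-cluster by its size (UPGMA).
    """
    total = sum(len(c) for c in clusters)
    weights = {}
    for cluster in clusters:
        if proportional:
            cluster_weight = Fraction(len(cluster), total)
        else:
            cluster_weight = Fraction(1, len(clusters))
        # Assumption: taxa within a sub-cluster share its weight equally.
        for taxon in cluster:
            weights[taxon] = cluster_weight / len(cluster)
    return weights


merged = [("C",), ("D", "E")]
print(per_taxon_weights(merged, proportional=False))  # WPGMA: C 1/2, D 1/4, E 1/4
print(per_taxon_weights(merged, proportional=True))   # UPGMA: C 1/3, D 1/3, E 1/3
```

    With simple averaging, C counts twice as much as D or E, so the result is weighted; with proportional averaging all three count equally, so the result is unweighted.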


Text material © 2025 by Steven M. Carr   [with thanks to Nell Lund]