Limits of prediction from correlation & regression analyses

    The plot compares unadjusted final exam marks for all persons who took two courses from the same instructor in successive semesters: Fall [Biol3250 - Genetics] and Winter [Biol3900 - Evolution]. These marks are highly correlated: r² = 0.51. The regression line, with a slope of r = 0.7, uses Fall 3250 marks (X) to predict Winter 3900 marks (Y). Note the broad trend for final marks in the two courses to match:

    (1) No-one with an A in 3250 got less than a B in 3900,
    (2) Most people with a B in 3250 got a B in 3900,
    (3) No-one with less than a C in 3250 got better than a C in 3900.

    However, despite the high correlation, there is considerable scatter about the regression line. Persons who got a B in 3250 got marks ranging from D to A in 3900, and those who got a C in 3250 got marks ranging from D to B in 3900. That is, the correlation captures the trend for the group as a whole, but individual performance still varies widely around the predicted mark.
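
    The size of that scatter can be read off r² itself. The sketch below is a minimal illustration, not part of the original analysis: it assumes a simple linear regression with a positive slope, and the function name unexplained_scatter is hypothetical. It uses the standard relation that a fraction 1 - r² of the variance in Y is left unexplained by X, so the residual scatter is roughly sqrt(1 - r²) times the standard deviation of Y.

```python
import math


def unexplained_scatter(r_squared):
    """Summarize how much scatter a simple linear regression leaves behind.

    Hypothetical helper for illustration. Assumes a positive correlation,
    so r is taken as the positive square root of r_squared.
    Returns (r, unexplained variance fraction, residual SD as a fraction of SD of Y).
    """
    r = math.sqrt(r_squared)
    unexplained = 1.0 - r_squared          # share of variance in Y not explained by X
    residual_sd_fraction = math.sqrt(unexplained)  # residual SD relative to SD of Y
    return r, unexplained, residual_sd_fraction


# r_squared = 0.51 is the value reported for the 3250 vs 3900 marks.
r, unexplained, resid = unexplained_scatter(0.51)
print(f"r = {r:.2f}")
print(f"unexplained variance = {unexplained:.2f}")
print(f"residual SD / SD of Y = {resid:.2f}")
```

    With r² = 0.51, about half of the variance in 3900 marks is not accounted for by 3250 marks, and the spread about the regression line is still about 70% of the overall spread in 3900 marks. This is consistent with the wide range of letter grades seen within each 3250 grade.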


All text material © 2015 by Steven M. Carr