Alfisol, Item 003: LMMpro, version 2.0
The Langmuir Optimization Program
plus
The Michaelis-Menten Optimization Program



Correlation Coefficient (r2)

When there is no pattern to a set of data, the best we can do is report the average value of the data. Using the average, we predict too high about half the time and too low the rest of the time, but we are never systematically far from the measured values. Without a pattern, the best prediction is y = yaverage for any value of x, and the probable error is expressed by the standard deviation (σ).

On the other hand, if there is a pattern, then we should be able to track it. The simplest pattern is a line, y = mx + b, where m = slope of the line, and b = y-intercept (the value of y when x = 0). We justify the claim that the data follow a linear pattern by showing that the line yields less error than a simple arithmetic average. This is known as the goodness-of-fit of the line, and it is expressed numerically by the correlation coefficient (r2):
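The slope and intercept of a best-fit line can be obtained from the standard least-squares formulas. The sketch below is illustrative only (the function name is our own, not part of LMMpro):

```python
# Least-squares fit of y = mx + b (a minimal sketch, not LMMpro's own code).
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line through the points."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
    b = (sy - m * sx) / n                          # y-intercept
    return m, b

# Points that lie exactly on y = 2x + 1:
m, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b)  # 2.0 1.0
```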


r2 = 1 - [ Σ (yi - yp)2 ] / [ Σ (yi - ya)2 ]

where,
yp = the predicted value when x = xi,
ya = arithmetic mean value of y for all values of x, and
yi = the experimental value measured when x = xi.

If the predicted line fits better than the average value, then its distance from each measured value will usually be smaller than the distance between the average value and that measured value. Summed across all the points, the numerator will then be smaller than the denominator, and the value of r2 will be close to 1.00. A perfect fit by a line gives r2 = 1.00, whereas a poor fit gives a low value. If r2 = 0, then the predicted line is no better than a simple average of the data collected. For any linear prediction, r2 ≤ 1, with r2 = 1.00 only when the fit is perfect.
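The r2 calculation described above can be sketched in a few lines. This is illustrative code under the definitions given in the text, not LMMpro itself:

```python
# Goodness-of-fit r2 for a set of predictions (a minimal sketch).
def r_squared(measured, predicted):
    """r2 = 1 - sum((yi - yp)^2) / sum((yi - ya)^2)."""
    ya = sum(measured) / len(measured)   # arithmetic mean of the data
    ss_res = sum((yi - yp) ** 2 for yi, yp in zip(measured, predicted))
    ss_tot = sum((yi - ya) ** 2 for yi in measured)
    return 1.0 - ss_res / ss_tot

# Data that nearly follow the line y = 2x + 1:
measured = [1.1, 2.9, 5.2, 6.8]
predicted = [2 * x + 1 for x in [0, 1, 2, 3]]  # [1, 3, 5, 7]
print(round(r_squared(measured, predicted), 4))  # 0.9947
```

A perfect fit (predicted equal to measured) returns exactly 1.0, and predictions no better than the mean return 0, matching the limits discussed above.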

This is easy enough to do by hand. It is even easier today with computers that do it for us.