Hello all.

Beginning with the HP-65, I've noticed that in lieu of the Linear Regression component, r (correlation coefficient), all of the application programs from the 55 onward calculate a coefficient of determination (r^{2}) instead. Since -1<=r^{2}<=1, as opposed to -1<=r<=1, why is this coefficient more important than r itself?

*Edited: 4 Apr 2012, 5:08 p.m. *

It is typical to report r^{2} rather than the correlation coefficient r itself. This makes the value always positive and still at most 1. I have rarely seen r used.

Quote:

Since -1<=r^{2}<=1, as opposed to -1<=r<=1, why is this coefficient more important than r itself?

I'm sorry, but I'm having trouble following you.

Did you mean to type 0 <= r^{2} <= 1 and -1 <= r <= +1?

-- Correlation --

Correlation (r) is a measure of association between two variables. The variables are not designated as dependent or independent.

The value of a correlation coefficient can vary from minus one to plus one.

A minus one indicates a perfect negative correlation (the variables move opposite each other), while a plus one indicates a perfect positive correlation (the variables move together). A correlation of zero means there is no linear relationship between the two variables.

-- Regression coefficient ---

Simple regression is used to examine the relationship between one dependent and one independent variable. Regression goes beyond correlation by adding prediction capabilities.

The coefficient of determination (r-squared) is the square of the correlation coefficient. Its value may vary from zero to one.

It has the advantage over the correlation coefficient in that it may be interpreted directly as the proportion of variance in the dependent variable that can be accounted for by the regression equation.

For example, an r-squared value of .49 means that 49% of the variance in the dependent variable can be explained by the regression equation. The other 51% is unexplained.
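To make the two quantities concrete, here is a minimal sketch in plain Python (no libraries) that computes r and r^2 for a small made-up dataset; the numbers are purely illustrative.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]             # a noisy positive trend (made-up data)
r = pearson_r(xs, ys)
print("r   =", round(r, 3))      # 0.8   -> sign gives the direction
print("r^2 =", round(r * r, 3))  # 0.64  -> 64% of variance explained
```

Note that r keeps its sign (here positive), while r^2 throws the sign away but reads directly as a proportion of variance.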

*Edited: 4 Apr 2012, 6:03 p.m. *

My goof. Yes, r^{2} is always between 0 and 1. My typographical error. Sorry for the mixup.

And thank you for the detailed descriptions. Very helpful as I'm trying to get a grasp of that coefficient and its cousins.

*Edited: 4 Apr 2012, 6:23 p.m. *

Matt,

The advent of more powerful tools like Excel's Data Analysis Toolkit has popularized ANOVA (Analysis of Variance) tables for regression results. These ANOVA tables include R-Sqr and its adjusted value (useful when comparing models with different numbers of terms) AND ... the F statistic. While R-Sqr is a good cursory measure of goodness of fit, I rely more on the F statistic to assess the usefulness of the regression results. The Excel Data Analysis Toolkit also produces the standard errors and 95% confidence intervals for the regression coefficients. This information can shed some light on whether the results are reliable or shaky! These results help in identifying terms that can be thrown out to give a better regression model.
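For readers who want to see where the ANOVA numbers come from, here is a rough sketch (not Namir's HP-35s code, and the data is made up) of the quantities in a one-predictor regression ANOVA table:

```python
def regression_anova(xs, ys):
    """ANOVA quantities for a simple (one-predictor) linear regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    yhat = [intercept + slope * x for x in xs]
    ssr = sum((yh - my) ** 2 for yh in yhat)             # regression SS (df = 1)
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, yhat))  # residual SS (df = n - 2)
    msr, mse = ssr / 1, sse / (n - 2)
    return {"slope": slope, "R2": ssr / (ssr + sse), "F": msr / mse}

tbl = regression_anova([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(tbl)  # R2 = 0.64, F ~ 5.33 for this toy data
```

A large F (relative to the F distribution with 1 and n-2 degrees of freedom) indicates the regression explains significantly more variance than the residual noise.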

When I wrote my stat pac for the HP-35s (click here to go to the software's web page), I included the regression ANOVA table as part of the results for linear regression.

Namir

Thanks. I'll check it out.

Following on from the above descriptions,

In linear regression we can calculate r = slope * sd_x / sd_y

and r is then commonly used as a standardized regression coefficient, i.e., it tells how many standard deviations of y are predicted for a 1-standard-deviation change in x. This is useful when the scales of measurement have no natural interpretation, and r also serves as a measure of linear association: the Pearson correlation coefficient of the two variables.
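The identity r = slope * sd_x / sd_y is easy to check numerically; a quick sketch on made-up data:

```python
import statistics as st

xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]                      # illustrative numbers only
mx, my = st.mean(xs), st.mean(ys)
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

slope = sxy / sxx                          # least-squares slope of y on x
r_from_slope = slope * st.stdev(xs) / st.stdev(ys)
r_direct = sxy / (sxx * syy) ** 0.5        # Pearson r computed directly
print(round(r_from_slope, 3), round(r_direct, 3))  # both 0.8
```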

r^2 is often used as a summary measure of explained variance, but the sign of r carries extra information: the direction of the association.

For a single coefficient, F = r^2 / (1-r^2) * (n-2) = t^2,

where t = coef/se is the t-test for the addition of the coefficient to the model, i.e., that the slope differs from zero.
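A quick numeric check of the identity F = r^2 / (1-r^2) * (n-2) = t^2 for the one-predictor case, again on made-up data:

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]                    # illustrative numbers only
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

slope = sxy / sxx
sse = syy - slope * sxy                 # residual sum of squares
mse = sse / (n - 2)
se_slope = (mse / sxx) ** 0.5
t = slope / se_slope                    # t statistic for the slope

r2 = sxy ** 2 / (sxx * syy)
F = r2 / (1 - r2) * (n - 2)
print(round(F, 4), round(t ** 2, 4))    # identical values
```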

So at the level of the simple linear regressions of the HP calculators (i.e., containing one predictor) these quantities are related to one another, and some can be considered redundant. For example, why compute an ANOVA for a linear regression with a single variable if t is provided by a package? Significance tests using F or t will give the same p-value.

Nick

*Edited: 5 Apr 2012, 5:38 a.m. *