Sunday, October 14, 2012

Correlation and regression

When you are deciding which variables to try out in a multiple regression, looking at their correlation with the dependent variable is a good start. You can get the correlations from Excel, using Data Analysis, as I show in this YouTube video.

Here is an example, where the dependent variable is Duration:


The number given is the 'r' value (the Pearson product-moment correlation coefficient), which ranges from -1 to +1. As a rule of thumb, a variable with an 'r' value greater than +0.7 or smaller than -0.7 will usually work in the regression. So I would expect Length (r = 0.779) to work.
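If you would rather do the same screening outside Excel, here is a minimal Python sketch. The numbers are made up purely for illustration (they are not the data behind the example above), the `Weight` variable and the names `pearson_r`, `candidates` and `THRESHOLD` are my own, and the 0.7 cutoff is just the rule of thumb mentioned above.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Made-up data for illustration only (not the figures from the post).
duration = [10, 12, 15, 19, 24, 30]          # dependent variable
candidates = {                                # possible independent variables
    "Length": [1.0, 1.5, 2.0, 2.4, 3.1, 3.9],
    "Weight": [5, 3, 6, 2, 7, 4],
}

THRESHOLD = 0.7  # rule-of-thumb cutoff, an assumption rather than a hard rule

# Keep only the variables whose |r| with Duration exceeds the cutoff.
strong = {}
for name, values in candidates.items():
    r = pearson_r(values, duration)
    if abs(r) > THRESHOLD:
        strong[name] = r

for name, r in strong.items():
    print(f"{name}: r = {r:.3f}")
```

With this invented data only Length clears the cutoff; Weight is almost uncorrelated with Duration and is dropped.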

Points to note: be careful of 'multicollinearity', which is where the independent variables are explaining the same thing. Length doesn't seem to be collinear with the other independent variables (look along the Length row).
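Looking along a row for collinearity can also be sketched in Python: correlate every pair of independent variables and flag the pairs with a large 'r'. Again the data is invented (here Width is deliberately almost a multiple of Length, so you can see what a flagged pair looks like), and the 0.7 cutoff for "too correlated" is my assumption, not a fixed rule.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Made-up independent variables for illustration only.
independents = {
    "Length": [1.0, 1.5, 2.0, 2.4, 3.1, 3.9],
    "Width":  [2.1, 2.9, 4.2, 4.7, 6.3, 7.7],  # roughly 2 x Length on purpose
    "Weight": [5, 3, 6, 2, 7, 4],
}

COLLINEAR_CUTOFF = 0.7  # assumed cutoff for "explaining the same thing"

# Check every pair of independent variables once.
flagged = []
for a, b in combinations(independents, 2):
    r = pearson_r(independents[a], independents[b])
    if abs(r) > COLLINEAR_CUTOFF:
        flagged.append((a, b, r))

for a, b, r in flagged:
    print(f"{a} and {b} look collinear: r = {r:.3f}")
```

In this made-up case only the Length and Width pair is flagged, so you would keep one of them rather than putting both in the regression.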