What is the difference between Multiple R-squared and Adjusted R-squared in a single-variate least squares regression?

fmark · May 20, 2010 · Viewed 134.9k times

Could someone explain to the statistically naive what the difference between Multiple R-squared and Adjusted R-squared is? I am doing a single-variate regression analysis as follows:

 v.lm <- lm(epm ~ n_days, data=v)
 print(summary(v.lm))

Results:

Call:
lm(formula = epm ~ n_days, data = v)

Residuals:
    Min      1Q  Median      3Q     Max 
-693.59 -325.79   53.34  302.46  964.95 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2550.39      92.15  27.677   <2e-16 ***
n_days        -13.12       5.39  -2.433   0.0216 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 410.1 on 28 degrees of freedom
Multiple R-squared: 0.1746,     Adjusted R-squared: 0.1451 
F-statistic: 5.921 on 1 and 28 DF,  p-value: 0.0216 

Answer

neilfws · May 20, 2010

The "adjustment" in adjusted R-squared is related to the number of variables and the number of observations.

If you keep adding variables (predictors) to your model, R-squared will improve: the predictors will appear to explain more of the variance, but some of that improvement may be due to chance alone. Adjusted R-squared tries to correct for this by taking into account the ratio (N-1)/(N-k-1), where N is the number of observations and k is the number of predictors. Written out in full, adjusted R-squared = 1 - (1 - R-squared) * (N-1)/(N-k-1); the sketch below plugs in the numbers from your output.
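To check this against your summary: you have 28 residual degrees of freedom and 2 estimated coefficients, so N = 30 and k = 1. A minimal sketch in plain R, using only the numbers reported above:

 # adjusted R-squared = 1 - (1 - R-squared) * (N - 1) / (N - k - 1)
 r2 <- 0.1746   # Multiple R-squared from summary(v.lm)
 N  <- 30       # observations: 28 residual df + 2 estimated coefficients
 k  <- 1        # predictors (just n_days)
 1 - (1 - r2) * (N - 1) / (N - k - 1)   # ~0.1451, matching the Adjusted R-squared line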

It's probably not a concern in your case, since you have a single variate.
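If you want to see what the adjustment guards against, here is a small illustration of my own (the data frame and variable names are made up for the example, not part of your data): fitting pure-noise predictors pushes Multiple R-squared up, while Adjusted R-squared is penalised for the extra terms.

 set.seed(1)
 d <- data.frame(y = rnorm(30), x1 = rnorm(30), x2 = rnorm(30), x3 = rnorm(30))
 summary(lm(y ~ x1, data = d))$r.squared                 # R-squared with one noise predictor
 summary(lm(y ~ x1 + x2 + x3, data = d))$r.squared       # never lower, usually higher, by chance alone
 summary(lm(y ~ x1 + x2 + x3, data = d))$adj.r.squared   # penalised for the two extra predictors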

Some references:

  1. How high, R-squared?
  2. Goodness of fit statistics
  3. Multiple regression
  4. Re: What is "Adjusted R^2" in Multiple Regression