Transforming data to normality. What is the best function for a given case?

Remi.b · Aug 27, 2013 · Viewed 28.4k times

Is there a function or package that searches for the best (or one of the best) transformations of a variable, so that a model's residuals are as close to normal as possible?


For example:

frml = formula(some_transformation(A) ~ B + I(B^2) + B:C + C)
model = aov(frml, data = data)
shapiro.test(residuals(model))

Is there a function that tells me which some_transformation() makes the residuals closest to normal?

Answer

Roland · Aug 27, 2013

You mean like the Box-Cox transformation?
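For reference, the Box-Cox family is usually written as follows for a strictly positive response y. This is the standard textbook form, given here as background and not taken from the output in this answer:

```latex
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
\log y, & \lambda = 0.
\end{cases}
```

With this parameterization, lambda = 1 leaves the response unchanged apart from a shift, and lambda = 0 corresponds to the log transformation.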

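library(car)  # the Wool data ships with car (via carData in recent versions)
m0 <- lm(cycles ~ len + amp + load, data = Wool)
plot(m0, which = 2)  # normal Q-Q plot of the residuals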

[Image: normal Q-Q plot of the residuals of m0]

# Box Cox Method, univariate
summary(p1 <- powerTransform(m0))
# bcPower Transformation to Normality 
# 
#    Est.Power Std.Err. Wald Lower Bound Wald Upper Bound
# Y1   -0.0592   0.0611          -0.1789           0.0606
# 
# Likelihood ratio tests about transformation parameters
#                              LRT df      pval
# LR test, lambda = (0)  0.9213384  1 0.3371238
# LR test, lambda = (1) 84.0756559  1 0.0000000


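# The Wald interval for lambda contains 0 and lambda = 1 is rejected,
# so the rounded power is 0, i.e. a log transformation.
# Fit the linear model with the transformed response:
coef(p1, round=TRUE)
summary(m1 <- lm(bcPower(cycles, p1$roundlam) ~ len + amp + load, Wool))
plot(m1, which = 2)  # normal Q-Q plot of the residuals after transformation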

[Image: normal Q-Q plot of the residuals of m1]
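Since the question checks normality with shapiro.test(), here is a minimal sketch that applies the same test to the residuals before and after the transformation. It assumes car is installed and that Wool is available once car is loaded. Note that powerTransform() estimates lambda by maximum likelihood under a normal linear model, so it does not directly optimize the Shapiro-Wilk statistic; the test is only a rough check.

```r
library(car)  # assumed installed; recent versions attach carData, which provides Wool

# Untransformed model and the estimated Box-Cox power
m0 <- lm(cycles ~ len + amp + load, data = Wool)
p1 <- powerTransform(m0)

# Refit with the response transformed by the rounded power
m1 <- lm(bcPower(cycles, p1$roundlam) ~ len + amp + load, data = Wool)

# Shapiro-Wilk test on the residuals, before and after the transformation
sw0 <- shapiro.test(residuals(m0))
sw1 <- shapiro.test(residuals(m1))
print(c(untransformed = sw0$p.value, transformed = sw1$p.value))
```

The same pattern should carry over to the model in the question by replacing the formula and the data set, provided the response is strictly positive.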