I have been receiving the above message while trying to test the accuracy of my model. The plan was to predict the last 15 time points and compare them to the actual data to get error values, but for some reason I got the "variable lengths differ" error message.
This uses the Johnson & Johnson data (data(jj)) from the astsa package. Here is the code and the relevant errors:
> ##set up JJ data and time because its quarterly data
> X.all<-jj[1:84]
> t<-time(jj)
>
> values<-length(t)-15
> ts<-t[1:values]
> tsq<-ts^2/factorial(2)
> X<-X.all[1:values]
> year.first<-values+1
> year.last<-length(t)
> ##setting t for 15 values using quarterly idea
> new<-data.frame(ts=t[year.first:year.last])
> X.true<-X.all[(values+1):length(t)]
> fit1<-lm(X~ts+tsq)
> Xhat<-predict(fit1,new,se.fit=TRUE)
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
variable lengths differ (found for 'tsq')
In addition: Warning message:
'newdata' had 15 rows but variables found have 69 rows
> X.hat<-round(Xhat$fit,2)
> error<-X.true-X.hat
The issue is that you're trying to call predict with a newdata argument that does not contain all of the variables used in your model. new only contains ts, not tsq, so predict falls back to the 69-row tsq sitting in your workspace, which is why the warning reports 15 rows against 69. You can solve this by:
1. creating a new that contains both ts and tsq, OR
2. defining tsq using I() notation in your model specification, like: lm(X ~ ts + I(ts^2/factorial(2)))
The I() notation makes the transformation part of the formula, so it is recomputed from whatever data you pass in, and you don't have to manually create power terms, etc. just to include them in your lm specification.
As an example, you could try this out with the iris dataset to see how it works better than your current approach:
fit1 <- lm(Sepal.Length ~ Sepal.Width + I(Sepal.Width^2/factorial(2)), data = iris)
new <- data.frame(Sepal.Width = seq(1,5,by = 0.25))
predict(fit1, new)
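If you would rather keep the squared term as its own variable (the first option), a minimal sketch on the same iris data could look like the following. The names train, new1, s2, fit_manual and fit_I are made up for illustration; the point is only that the transformed column has to exist in both the fitting data and newdata.

```r
# Option 1: store the transformed term as a real column in the fitting data...
# (train, new1, s2, fit_manual, fit_I are illustrative names, not from the question)
train <- transform(iris, s2 = Sepal.Width^2 / factorial(2))
fit_manual <- lm(Sepal.Length ~ Sepal.Width + s2, data = train)

# ...and rebuild the same column in newdata before predicting.
new1 <- data.frame(Sepal.Width = seq(1, 5, by = 0.25))
new1$s2 <- new1$Sepal.Width^2 / factorial(2)
pred_manual <- predict(fit_manual, new1)

# Option 2 for comparison: the transformation lives inside the formula,
# so newdata only needs Sepal.Width.
fit_I <- lm(Sepal.Length ~ Sepal.Width + I(Sepal.Width^2 / factorial(2)), data = iris)
pred_I <- predict(fit_I, new1["Sepal.Width"])

# Both routes describe the same model, so the predictions should agree.
all.equal(unname(pred_manual), unname(pred_I))
```

Either route works; the I() version is simply harder to get wrong, because there is no second column to forget.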
We can compare this to your approach and observe the error you're encountering:
s2 <- I(iris$Sepal.Width^2/factorial(2))
fit1 <- lm(Sepal.Length ~ Sepal.Width + s2, data = iris)
new <- data.frame(Sepal.Width = seq(1,5,by = 0.25))
predict(fit1, new)
# Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
# variable lengths differ (found for 's2')
# In addition: Warning message:
# 'newdata' had 17 rows but variables found have 150 rows
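Applied back to your hold-out test, the I() fix might look roughly like this. It is a sketch under one assumption: it uses the JohnsonJohnson series that ships with base R's datasets package as a stand-in for astsa's jj (both are described as quarterly earnings per share for 1960 to 1980, but I have not checked them value by value). The names t.all, n.test and train are introduced here; the rest follow your code.

```r
# Assumption: base R's JohnsonJohnson stands in for astsa's jj (84 quarterly values).
X.all <- as.numeric(JohnsonJohnson)
t.all <- as.numeric(time(JohnsonJohnson))  # t.all avoids masking base::t()

n.test <- 15
values <- length(t.all) - n.test           # number of training points

# Keep the response and the time index together in one training data frame.
train  <- data.frame(X = X.all[1:values], ts = t.all[1:values])
new    <- data.frame(ts = t.all[(values + 1):length(t.all)])
X.true <- X.all[(values + 1):length(t.all)]

# The quadratic term is built inside the formula, so newdata only needs ts.
fit1 <- lm(X ~ ts + I(ts^2 / factorial(2)), data = train)
Xhat <- predict(fit1, new, se.fit = TRUE)

X.hat <- round(Xhat$fit, 2)
error <- X.true - X.hat
length(error)  # 15
```

Fitting with data = train also means the model no longer depends on loose vectors in the workspace, which is what allowed the 69-row tsq to be picked up silently in the first place.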