I use R only a little bit and never use data frames, which makes understanding the correct use of predict difficult. I have my data in plain matrices, not data frames, call them a and b, which are N x p and M x p matrices respectively. I can run the regression lm(a[,1] ~ a[,-1]). I would like to use the resulting lm object to predict b[,1] from b[,-1]. My naive guess of predict(lm(a[,1] ~ a[,-1]), b[,-1]) doesn't work. What's the right syntax to use the lm to get a vector of predictions?
You can store a whole matrix in one column of a data.frame:
x <- a [, -1]
y <- a [, 1]
data <- data.frame (y = y, x = I (x))
str (data)
## 'data.frame': 10 obs. of 2 variables:
## $ y: num 0.818 0.767 -0.666 0.788 -0.489 ...
## $ x: AsIs [1:10, 1:9] 0.916274.... 0.386565.... 0.703230.... -2.64091.... 0.274617.... ...
model <- lm (y ~ x, data = data)
newdata <- data.frame (x = I (b [, -1]))
predict (model, newdata)
## 1 2
## -3.795722 -4.778784
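To make the steps above reproducible, here is a minimal self-contained sketch. The simulated matrices, the dimensions N, M, p and the names train, pred and manual are assumptions made up for illustration; they are not from the question. The last lines cross-check predict against the linear predictor computed by hand from coef.

```r
set.seed(1)                      # hypothetical simulated data, for illustration only
N <- 10; M <- 2; p <- 4
a <- matrix(rnorm(N * p), N, p)  # training matrix: column 1 is the response
b <- matrix(rnorm(M * p), M, p)  # new data with the same column layout

# I() keeps the predictor matrix as a single matrix-valued column named x
train <- data.frame(y = a[, 1], x = I(a[, -1]))
model <- lm(y ~ x, data = train)

# newdata must contain a column with the same name (x) as in the formula
pred <- predict(model, newdata = data.frame(x = I(b[, -1])))

# cross-check: intercept plus b[, -1] times the fitted slopes
manual <- drop(cbind(1, b[, -1]) %*% coef(model))
all.equal(unname(pred), manual)
```

The key point is that predict matches variables by name: the formula refers to x, so newdata needs a column called x, and the matrix-valued column supplies all predictors at once.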
The paper about the pls package (Mevik, B.-H. and Wehrens, R.: The pls Package: Principal Component and Partial Least Squares Regression in R. Journal of Statistical Software, 2007, 18, 1-24) explains this technique.
Another example, with a spectroscopic data set (quinine fluorescence), is in vignette ("flu") of my package hyperSpec.
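For comparison, a different and fairly common route (not the matrix-column technique described above) is to convert both matrices to ordinary data frames so that each column becomes its own variable. The sketch below assumes the matrices have no column names, in which case as.data.frame names the columns V1, V2, and so on; the simulated data and the names train and new are hypothetical.

```r
set.seed(1)                          # hypothetical simulated data, for illustration only
a <- matrix(rnorm(40), 10, 4)        # training matrix: column 1 is the response
b <- matrix(rnorm(8), 2, 4)          # new data with the same column layout

# unnamed matrix columns become V1, V2, ... in both data frames
train <- as.data.frame(a)
new   <- as.data.frame(b)

# V1 is the response; "." stands for all remaining columns of train
model <- lm(V1 ~ ., data = train)
pred  <- predict(model, newdata = new)

# cross-check against the hand-computed linear predictor
manual <- drop(cbind(1, b[, -1]) %*% coef(model))
all.equal(unname(pred), manual)
```

This gives one coefficient per named column, which can be easier to read with few predictors; the matrix-column form tends to be more convenient when the predictors form one wide block, such as spectra.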