Is it somehow possible to run a linear regression for every single row of a data frame without using a loop? The output (intercept and slope of the trend line) should be added to the original data frame as new columns.
To make my intention clearer, I have prepared a very small example:
day1 <- c(1,3,1)
day2 <- c(2,2,1)
day3 <- c(3,1,5)
output.intercept <- c(0,4,-1.66667)
output.slope <- c(1,-1,2)
data <- data.frame(day1,day2,day3,output.intercept,output.slope)
The input variables are day1 to day3; let's say those are the sales of different shops on 3 consecutive days. What I want to do is calculate a linear trend line for each of the 3 rows and add the resulting parameters to the original table as new columns (see output.intercept and output.slope for the expected values).
The solution should be very efficient in terms of computation time, since the real data frame has several hundred thousand rows.
Best, Christoph
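# Design matrix shared by all rows: an intercept column and the day index 1..3
design.mat <- cbind(1,1:3)
# lm.fit accepts a matrix response: one column per regression, so transpose the day columns
response.mat <- t(data[,1:3])
# Coefficients come back as a 2 x nrow(data) matrix (row 1 = intercept, row 2 = slope)
reg <- lm.fit(design.mat, response.mat)$coefficients
# Transpose back and append as new columns (named x1 and x2 by lm.fit)
data <- cbind(data, t(reg))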
#   day1 day2 day3 output.intercept output.slope        x1 x2
# 1    1    2    3          0.00000            1  0.000000  1
# 2    3    2    1          4.00000           -1  4.000000 -1
# 3    1    1    5         -1.66667            2 -1.666667  2
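Since every row shares the same x values (days 1 to 3), the least-squares slope and intercept can also be written in closed form and computed with a single matrix product. This is only a sketch of that alternative, not something from the original answer; the names sales, xc, slope and intercept are made up for illustration, and the values are the ones from the question.

```r
# Sales matrix: one row per shop, one column per day (filled column by column)
sales <- matrix(c(1, 3, 1,    # day1
                  2, 2, 1,    # day2
                  3, 1, 5),   # day3
                ncol = 3)

x  <- seq_len(ncol(sales))   # day index 1, 2, 3
xc <- x - mean(x)            # centred day index: -1, 0, 1

# Least-squares slope per row: sum((x - mean(x)) * y) / sum((x - mean(x))^2)
slope <- as.vector(sales %*% xc) / sum(xc^2)

# Intercept per row: mean(y) - slope * mean(x)
intercept <- rowMeans(sales) - slope * mean(x)

result <- data.frame(sales, intercept, slope)
```

On the example data this should reproduce the expected output.intercept and output.slope columns. It assumes there are no missing values in the day columns.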
However, if you have massive data, it might be necessary to loop because of memory restrictions. In that case I would use a long-format data.table and use the package's by syntax to loop.
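A minimal sketch of what that long-format approach could look like, assuming the data.table package is installed. The shop id column and the names wide, long, coefs and result are hypothetical, added only for illustration, and the per-group fit uses the closed-form simple-regression formulas rather than a call to lm, which I would expect to be slow when repeated for hundreds of thousands of groups.

```r
library(data.table)

# Wide input as in the question, plus a hypothetical shop id to group by
wide <- data.table(shop = 1:3,
                   day1 = c(1, 3, 1),
                   day2 = c(2, 2, 1),
                   day3 = c(3, 1, 5))

# Reshape to long format: one row per shop and day
long <- melt(wide, id.vars = "shop", variable.name = "day", value.name = "sales")

# Turn "day1", "day2", ... into the numeric day index
long[, day := as.integer(sub("day", "", day))]

# Fit one trend line per shop using by
coefs <- long[, {
  b <- cov(day, sales) / var(day)                     # slope
  .(intercept = mean(sales) - b * mean(day), slope = b)
}, by = shop]

# Attach the coefficients to the original wide table
result <- merge(wide, coefs, by = "shop")
```

Whether this actually saves memory compared to the lm.fit version depends on your data; the long table itself is larger than the wide one, so it may be worth processing the shops in chunks.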