I would like to extract the model coefficients generated by glmnet and create a SQL query from them. The function coef(cv.glmnet.fit) yields a 'dgCMatrix' object. When I convert it to a matrix using as.matrix, the variable names are lost and only the coefficient values are left behind.
I know the coefficients can be printed to the screen, but is it possible to write the names to a data frame?
Can anybody help me extract these names?
UPDATE: The first two comments on my answer are both right. I have kept the old answer below just for posterity.
The following answer is short, works, and does not need any other package:
tmp_coeffs <- coef(cv.glmnet.fit, s = "lambda.min")
data.frame(name = tmp_coeffs@Dimnames[[1]][tmp_coeffs@i + 1], coefficient = tmp_coeffs@x)
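The reason for the +1 is that the @i slot stores zero-based row indices (the intercept is row 0), whereas @Dimnames[[1]] is indexed from 1 like any other R vector. Note that @i and @x only hold the non-zero entries, so the resulting data frame lists only the coefficients that were not shrunk to zero.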
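To see the indexing in isolation, here is a minimal sketch that does not need a fitted model. It builds a small dgCMatrix by hand with the Matrix package as a stand-in for the output of coef(); the names toy_coeffs, x1 to x3 and my_table are made up for illustration, and the SQL string at the end assumes the table columns carry the same names as the model variables.

```r
library(Matrix)  # provides the dgCMatrix class that coef() returns for glmnet fits

# Hypothetical stand-in for coef(cv.glmnet.fit, s = "lambda.min"):
# a 4 x 1 sparse column in which x2 has been shrunk to zero.
toy_coeffs <- sparseMatrix(
  i = c(1, 2, 4), j = c(1, 1, 1), x = c(1.5, 0.8, -0.3),
  dims = c(4, 1),
  dimnames = list(c("(Intercept)", "x1", "x2", "x3"), "s1")
)

# @i holds the zero-based row indices of the non-zero entries only.
toy_coeffs@i

# Shift by one to index the (one-based) row names.
coef_df <- data.frame(
  name = toy_coeffs@Dimnames[[1]][toy_coeffs@i + 1],
  coefficient = toy_coeffs@x
)

# Assumed use case: turn the data frame into a linear-predictor expression.
# The intercept enters as a constant, every other row as "coefficient * column".
terms <- ifelse(
  coef_df$name == "(Intercept)",
  as.character(coef_df$coefficient),
  paste0(coef_df$coefficient, " * ", coef_df$name)
)
sql <- paste0("SELECT ", paste(terms, collapse = " + "), " AS score FROM my_table")
```

The x2 row never shows up in coef_df because zero entries are not stored in the sparse matrix at all. If any variable names are not valid SQL identifiers, they would need quoting before being pasted into the query.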
OLD ANSWER: (only kept for posterity) Try these lines:
The non-zero coefficients:
coef(cv.glmnet.fit, s = "lambda.min")[which(coef(cv.glmnet.fit, s = "lambda.min") != 0)]
The features that are selected:
colnames(regression_data)[which(coef(cv.glmnet.fit, s = "lambda.min") != 0)]
Then putting them together as a data frame is straightforward, but let me know if you want that part of the code as well.