glmnet error for logistic regression/binomial

groutgauss · May 1, 2015 · Viewed 10.1k times

I get this error when fitting a logistic regression with glmnet() and family="binomial":

> data <- read.csv("DAFMM_HE16_matrix.csv", header=FALSE)
> x <- as.data.frame(data[,1:3])
> x <- model.matrix(~.,data=x)
> y <- data[,4]

> train=sample(1:dim(x)[1],287,replace=FALSE)

> xTrain=x[train,]
> xTest=x[-train,]
> yTrain=y[train]
> yTest=y[-train]

> fit = glmnet(xTrain,yTrain,family="binomial")

Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs,  : 
one multinomial or binomial class has 1 or 0 observations; not allowed

Any help would be greatly appreciated. I've searched the internet and haven't been able to find anything that helps.

EDIT:

Here's what data looks like:

> data
          V1       V2    V3      V4
1   34927.00   156.60 20321  -12.60
2   34800.00   156.60 19811  -18.68
3   29255.00   156.60 19068    7.50
4   25787.00   156.60 19608    6.16
5   27809.00   156.60 24863   -0.87
...
356 26495.00 12973.43 11802    6.35
357 26595.00 12973.43 11802   14.28
358 26574.00 12973.43 11802    3.98
359 25343.00 14116.18 11802   -2.05

Answer

prahlad · Apr 18, 2016

I think it is because of the levels of your factor variable. Suppose the factor has 10 levels and one of them has only a single record: glmnet rejects that, so try removing that level. After removing the record, you can drop the now-unused level with drop.levels() from the gdata package.
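
A minimal sketch of that check and cleanup, using only base R (droplevels() plays a similar role to gdata's drop.levels()). The toy x and y below are hypothetical stand-ins for xTrain and yTrain, not the asker's data. One thing worth checking first: glmnet converts a vector response to a factor for family="binomial", so if the response is continuous, as V4 in the question appears to be, nearly every distinct value ends up as its own class with one observation. In that case a two-class outcome would have to be derived before fitting, and how to derive it depends on the analysis.

```r
# Hypothetical toy data standing in for xTrain / yTrain.
x <- matrix(seq_len(14), nrow = 7, ncol = 2)
y <- factor(c("a", "a", "a", "b", "b", "b", "c"))

# Count observations per class; a count of 0 or 1 in any class
# is what triggers the "1 or 0 observations" error.
counts <- table(y)
print(counts)

# Keep only the rows whose class has more than one observation.
keep <- y %in% names(counts)[counts > 1]
x_clean <- x[keep, , drop = FALSE]

# Subsetting a factor keeps empty levels, so drop them explicitly.
y_clean <- droplevels(y[keep])
print(table(y_clean))
```

With the rare class removed, x_clean and y_clean would be what gets passed to glmnet(). Running table() on the real yTrain is the quickest way to see whether the problem is one rare level or a response that is not categorical at all.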