Warning lme4: Model failed to converge with max|grad|

Valabe picture Valabe · Oct 28, 2018 · Viewed 14.8k times · Source

I have to run a lmer with a log transformed response variable, a continuous variable as fixed effect and and a nested random effect:

first<-lmer(logterrisize~spm + (1|studyarea/teriid),
            data = Data_table_for_analysis_Character_studyarea,
            control=lmerControl(optimizer="Nelder_Mead",
                                 optCtrl=list(maxfun=1e4)))

I got this error message: Error in length(value <- as.numeric(value)) == 1L : Downdated VtV is not positive definite

I tried this with bobyqa() as optimization argument and got this warning messages:

1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : 
Model failed to converge with max|grad| = 0.753065 (tol = 0.002, component
1) 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,:
Model is nearly unidentifiable: very large eigenvalue-Rescale variables?

my summary looks like this:

Linear mixed model fit by REML ['lmerMod'] 
Formula: logterrisize ~ spm + (1 studyarea/teriid) Data: Data_table_for_analysis_Character_studyareaControl: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 10000)) REML criterion at convergence: -6079.6Scaled residuals: 
   Min         1Q     Median         3Q        Max 
-3.639e-07 -4.962e-08  3.310e-09  5.293e-08  9.725e-07 
Random effects:
 Groups           Name        Variance  Std.Dev.
 teriid:studyarea (Intercept) 1.291e-01 3.593e-01
 studyarea        (Intercept) 1.944e-02 1.394e-01
 Residual                     4.506e-15 6.712e-08
Number of obs: 273, groups:  teriid:studyarea, 66; studyarea, 22
Fixed effects:
 Estimate Std. Error t value
(Intercept)  1.480e+00  5.631e-02   26.28
spm         -5.785e-16  8.507e-10    0.00
Correlation of Fixed Effects:
 (Intr) spm 0.000  convergence code: 0
Model failed to converge with max|grad| = 0.753065 (tol = 0.002, component1)
Model is nearly unidentifiable: very large eigenvalue - Rescale variables?

my data looks like following:

show(logterrisize) [1] 1.3317643 1.3317643 1.3317643 0.1295798 0.1295798 1.5051368 1.5051368 1.5051368 1.5051368 [10] 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 [19] 1.5051368 1.5051368 1.5051368 1.4665993 1.4665993 1.4665993 1.8282328 1.8282328 1.9252934 [28] 1.9252934 1.9252934 2.3006582 2.3006582 2.5160920 2.7774040 2.7774040 3.3398623 3.3398623 [37] 3.4759297 1.2563594 1.6061204 1.6061204 1.7835139 1.7835139 2.1669498 2.1669498 2.1669498 [46] 2.1669498 0.7264997 0.7458155 0.8380524 0.8380524 0.8380524 0.8380524 0.8380524 0.8380524

 show(spm)  [1] 18.461538 22.641509 35.172414 10.418006 15.611285  3.482143  3.692308  4.483986  4.821429 [10]  6.000000  6.122449  6.176471  6.220736  6.260870  6.593407  7.010309  9.200000  9.473684 [19]  9.600000 12.600000 14.200000 16.146179 28.125000 30.099010 13.731343 14.432990 11.089109 [28] 17.960526 32.903226  8.955224 33.311688  8.800000 11.578947 20.000000 14.455446 18.181818 [37] 28.064516 25.684211 17.866667 23.142857 18.208955 20.536913 11.419355 11.593220 12.703583 [46] 20.000000  3.600000 11.320755  6.200000  6.575342 12.800000 19.109589 20.124224 22.941176 [55]  4.600000  6.600000  6.771160  8.000000 19.200000 19.400000 22.773723  3.333333  4.214047

Studyarea are character names and teriID represents continuous numbers of the study sites.

My data frame looks like this:enter image description here

Did I forget anything to include to the equation while using a log transformed variable? Thanks!

EDIT: I used ?convergence to check on convergence errors. I tried this:

## 3. recompute gradient and Hessian with Richardson extrapolation

devfun <- update(first, devFunOnly=TRUE)
if (isLMM(first)) {
pars <- getME(first,"theta")
} else {## GLMM: requires both random and fixed parameters
pars <- getME(first, c("theta","fixef"))
}
if (require("numDeriv")) {
 cat("hess:\n"); print(hess <- hessian(devfun, unlist(pars)))
cat("grad:\n"); print(grad <- grad(devfun, unlist(pars)))
cat("scaled gradient:\n")
print(scgrad <- solve(chol(hess), grad))}

and got this answer:

hess:
      [,1]      [,2]
[1,] 147.59157 -14.37956
[2,] -14.37956 120.85329
grad:
[1] -222.1020 -108.1038
scaled gradient:
[1] -19.245584  -9.891077

Unfortunately I don´t know what the answer should tell me.

2nd EDIT:

I tried numerous optimizer and while using this:

first<-lmer(logterrisize~spm + (1|studyarea/teriid),REML=FALSE,
            data = Data_table_for_analysis_Character_studyarea,
            control=lmerControl(optimizer="optimx",
                                 optCtrl=list(method='nlminb')))

I only got one warning: In optwrap(optimizer, devfun, getStart(start, rho$lower, rho$pp), : convergence code 1 from optimx

now my summary looks like this:

Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: logterrisize ~ spm + (1 | studyarea/teriid)
Data: Data_table_for_analysis_Character_studyarea
Control: lmerControl(optimizer = "optimx", optCtrl = list(method ="nlminb"))
AIC      BIC   logLik deviance df.resid 
 -3772.4  -3754.3   1891.2  -3782.4      268 
Scaled residuals: 
   Min         1Q     Median         3Q        Max 
-1.523e-04 -1.693e-05  1.480e-06  1.436e-05  3.332e-04 
Random effects:
Groups           Name        Variance  Std.Dev. 
teriid:studyarea (Intercept) 8.219e-02 0.2866882
studyarea        (Intercept) 7.478e-02 0.2734675
Residual                     3.843e-10 0.0000196
Number of obs: 273, groups:  teriid:studyarea, 66; studyarea, 22
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.551e+00  7.189e-02   21.58
spm         3.210e-11  2.485e-07    0.00
Correlation of Fixed Effects:
(Intr)spm 0.000 
convergence code: 1

So may I able to turn a blind eye on this warning message or will this be a huge mistake?

Answer

Ben Bolker picture Ben Bolker · Oct 29, 2018

tl;dr every observation on a territory shares the same territory size, so your random effect of territory ID is essentially explaining everything and leaving no variation at all for either the log(terrsize) fixed effect or the residual variance. Leaving the random effect of territory ID out of the model seems to give reasonable answers; a simulated data set replicates this pathology pretty well, but suggests that you'll end up with an underestimate of the spm effect ...

read data and plot

library(readxl)
library(dplyr)

dd <- (read_excel("lme4_terr_dataset.xlsx")
    %>% rename(spm="scans per min",
               studyarea="Study areaID",
               teriid="TerritoryID",
               terrsize="Territory_Size")
)

library(ggplot2); theme_set(theme_bw())
library(ggalt)
(ggplot(dd, aes(spm,terrsize,colour=studyarea))
    +geom_point()
    +geom_encircle(aes(group=teriid))
    +theme(legend.position="none")
    + scale_y_log10()
)

enter image description here

This plot, with its horizontal lines of values from the same territory ID, is what helped me diagnose the problem. Confirming that every territory ID has a single territory size for all observations:

tt <- with(dd,table(terrsize,teriid))
all(rowSums(tt>0)==1)  ## TRUE

model fitting

library(lme4)
m1 <- lmer(log(terrsize) ~ spm + (1|studyarea/teriid), dd)
## replicate warnings    
m2 <- lmer(log(terrsize) ~ spm + (1|studyarea), dd)
## no warnings

now simulate similar-looking data

set.seed(101)
## experimental design: rep within f2 (terr_id) within f1 (study area)
ddx <- expand.grid(studyarea=factor(letters[1:10]),
                   teriid=factor(1:4),rep=1:5)
## study-area, terr_id effects, and spm
b_studyarea <- rnorm(10)
b_teriid <- rnorm(40)
ddx <- within(ddx, {
    int <- interaction(studyarea,teriid)
    spm <- rlnorm(nrow(ddx), meanlog=1,sdlog=1)
})
## compute average spm per terr/id
## (because response will be identical across id)
spm_terr <- aggregate(spm~int, data=ddx, FUN=mean)[,"spm"]
ddx <- within(ddx, {
    mu <- 1+0.2*spm_terr[int]+b_studyarea[studyarea] + b_teriid[int]
    tsize <- rlnorm(length(levels(int)), meanlog=mu, sdlog=1)
    terrsize <- tsize[int]
})
gg1 %+% ddx

enter image description here

fit simulated data

This gives similar behaviour to the real data:

lmer(log(terrsize) ~ spm + (1|studyarea/teriid), ddx)

We can avoid the warnings by dropping teriid:

m1 <- lmer(log(terrsize) ~ spm + (1|studyarea), ddx)

But the true effect of spm (0.2) will be underestimated (because of the ignored noise from teriid ...)

round(confint(m1, parm="beta_"),3)
##             2.5 % 97.5 %
## (Intercept) 1.045  2.026
## spm         0.000  0.070

aggregating

On the basis of this single simulation, it looks like aggregating to the level of the territory (as recommended e.g. by Murtaugh 2007, "Simplicity and complexity in ecological data analysis" Ecology) and weighting by number of samples per territory gives a reasonable estimate of the true spm effect ...

ddx_agg <- (ddx
    %>% group_by(studyarea,terrsize,teriid)
    %>% summarise(spm=mean(spm),
                  n=n())
)
library(nlme)
m3x <- lme(log(terrsize) ~ spm, random=~1|studyarea, data=ddx_agg,
          weights=varFixed(~I(1/n)))
round(summary(m3x)$tTab,3)
            Value Std.Error DF t-value p-value
(Intercept) 0.934     0.465 29   2.010   0.054
spm         0.177     0.095 29   1.863   0.073