Error while testing glm with gamma family

bhuss picture bhuss · Jun 19, 2013 · Viewed 7.4k times · Source

I'm currently working on toxicity of shellfish on six french bays in 10 years.. I created a proxy that represents the quantity of toxin that appeared during the year. Now i would like to explain this toxicity by various parameters. For the moment, i focus on studying the effect of microalgal blooms on the toxicity. In order to do that, i described each annual bloom: max of cells observed during the year, duration of the bloom, week of beginning etc...

as i have very little data on toxicity, i'd like to keep things really simple, so i'm testing glms on 1, 2, or 3 of these bloom parameters, and i allow interactions.

The code works well with log-normal family, but i would like to compare with a gamma distribution and even a quasi-gamma (the mean-variance works well on the graphic with var=2*mean²).

when i start with gamma, i get this message:

 Error: no valid set of coefficients has been found: please supply starting values
In addition: Warning message:
In log(ifelse(y == 0, 1, y/mu)) : NaNs produced

and it appears from the seventh model: a model with only one parameter: Proxy~DuréeB!

i read that i needed to find coef with start= argument, but I have 15 000 models to check, containting 1, 2 or 3 parameters. So here are my questions:

  1. why is it not working on such a simple model??
  2. Is there an element that i missed, an error in my code that explains the message?
  3. Is there a way to put starting coeffs for all of my models?

here are my code and datas:

output<-sapply(models,  function (x) { # x<-models[7]

    glm.01<-glm(as.formula(x), family=Gamma, data=TestGLM)

    Aic<-AIC(glm.01)  #+2*sum(log(TestGLM$Proxy))# correction for log normal AIC
    Dev<-(glm.01$null.deviance-glm.01$deviance)/(glm.01$null.deviance)
    c(Aic, Dev)

  })

> dput(TestGLM)

structure(list(Annee = c(2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 
2009L, 2010L, 2011L, 2012L, 2003L, 2004L, 2005L, 2006L, 2007L, 
2008L, 2009L, 2010L, 2011L, 2012L, 2003L, 2004L, 2005L, 2006L, 
2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2003L, 2004L, 2005L, 
2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2003L, 2004L, 
2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2003L, 
2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L
), Proxy = c(1, 1, 4.7, 72.6, 37.3, 32.5, 12.4, 164.2, 33.3, 
63.3, 0.85, 10.6, 37.7, 21.9, 37.4, 30.2, 38.3, 445.9, 57, 29.7, 
0.48, 48.4, 31.7, 0.85, 2.2, 0.85, 2.1, 1.9, 75.1, 287.9, 0.15, 
0.85, 0.85, 2.65, 1, 0.85, 1, 0.05, 1, 0.45, 0.15, 1, 1, 0.85, 
0.85, 0.85, 0.1, 194.85, 26.1, 21.8, 1, 46.5, 12, 4, 180.6, 31.7, 
11.7, 3.2, 6, 0.7), Libellé = structure(c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L), .Label = c("BaiedeConcarneau", "BaiedeQuiberon", 
"BaiedeSeine", "BaiedeStBrieuc", "PertuisBreton", "RadedeBrest"
), class = "factor"), Max = c(462300, 1616150, 1378900, 9362300, 
718900, 5521866.667, 742400, 258550, 486700, 1360400, 58700, 
1200000, 2e+06, 255300, 252000, 614500, 1263900, 383100, 440833.3333, 
736700, 1173300, 218950, 971300, 468566.6667, 176300, 488400, 
696000, 28850, 65700, 575650, 11000, 8900, 13100, 36200, 133500, 
111300, 5500, 5500, 8900, 5800, 33800, 51500, 19100, 96700, 13800, 
13000, 4900, 450766.6667, 4200, 30700, 47900, 393950, 33000, 
62700, 325700, 1304000, 290100, 116600, 85600, 62300), QuinzaineMax = c(13L, 
13L, 7L, 9L, 19L, 13L, 12L, 9L, 15L, 13L, 12L, 12L, 7L, 12L, 
12L, 12L, 12L, 9L, 11L, 13L, 12L, 11L, 12L, 16L, 14L, 11L, 10L, 
15L, 17L, 11L, 12L, 16L, 11L, 20L, 11L, 11L, 14L, 15L, 9L, 6L, 
14L, 12L, 14L, 10L, 21L, 12L, 8L, 7L, 9L, 9L, 12L, 12L, 15L, 
15L, 19L, 10L, 12L, 17L, 10L, 15L), DébutB = c(11L, 13L, 7L, 
9L, 11L, 13L, 10L, 8L, 15L, 7L, 11L, 12L, 7L, 10L, 11L, 10L, 
12L, 9L, 11L, 7L, 12L, 11L, 11L, 15L, 13L, 10L, 9L, 4L, 7L, 10L, 
9L, 7L, 7L, 4L, 10L, 10L, 4L, 3L, 1L, 6L, 9L, 11L, 12L, 6L, 6L, 
1L, 1L, 6L, 1L, 8L, 11L, 12L, 7L, 10L, 9L, 10L, 8L, 10L, 10L, 
8L), FinB = c(14L, 15L, 15L, 9L, 19L, 13L, 18L, 26L, 15L, 14L, 
17L, 13L, 13L, 17L, 18L, 13L, 12L, 17L, 12L, 13L, 12L, 13L, 13L, 
16L, 21L, 11L, 10L, 16L, 18L, 17L, 22L, 17L, 20L, 20L, 11L, 11L, 
17L, 26L, 22L, 26L, 15L, 13L, 17L, 14L, 26L, 18L, 21L, 10L, 22L, 
12L, 16L, 19L, 21L, 18L, 19L, 10L, 14L, 20L, 18L, 18L), DuréeB = c(1L, 
1L, 1L, 1L, 4L, 1L, 3L, 3L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 4L, 1L, 
2L, 2L, 1L, 1L, 3L, 3L, 2L, 2L, 2L, 2L, 1L, 1L, 3L, 2L, 2L, 1L, 
1L, 2L, 2L, 5L, 2L, 3L, 4L, 1L, 3L, 4L, 1L, 1L, 1L, 3L, 2L, 1L, 
5L, 6L, 1L, 1L, 1L, 1L, 1L, 1L, 8L, 1L, 1L), AbCum = c(1232834.481, 
2295533.333, 2618400, 9540123.181, 2721369.612, 6402845.628, 
2131316.116, 1098695.022, 635295.5066, 2341036.687, 176743.6525, 
2008022.948, 4088103.498, 691735.877, 564020.5998, 1396076.209, 
1701772.35, 699439.2121, 667583.3333, 1274466.667, 1222289.094, 
501194.8149, 1882821.151, 645556.1372, 317050, 641859.2929, 1094461.592, 
102315.0749, 252271.4604, 1719625.936, 27751.20166, 23800, 56531.29739, 
90193.6705, 189335.9309, 178272.214, 15000, 26350, 47748.27689, 
25566.68521, 78063.53164, 97687.20404, 69380.95845, 188604.3622, 
48992.92008, 32453.63676, 29969.77734, 1003741.845, 20519.75684, 
83103.25321, 168584.9387, 481250.7282, 92726.1788, 128114.5066, 
811650, 1526211.659, 546353.7302, 440655.1676, 232500, 219290.8449
), NbBloom = c(2L, 2L, 3L, 1L, 2L, 1L, 3L, 4L, 1L, 3L, 3L, 1L, 
3L, 3L, 2L, 1L, 1L, 3L, 1L, 2L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 3L, 
4L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 2L, 4L, 5L, 5L, 2L, 1L, 2L, 3L, 
6L, 3L, 3L, 2L, 7L, 1L, 1L, 2L, 3L, 3L, 3L, 1L, 2L, 2L, 4L, 2L
)), .Names = c("Annee", "Proxy", "Libellé", "Max", "QuinzaineMax", 
"DébutB", "FinB", "DuréeB", "AbCum", "NbBloom"), class = "data.frame", row.names = c("4", 
"10", "16", "22", "28", "34", "40", "46", "52", "58", "5", "11", 
"17", "23", "29", "35", "41", "47", "53", "59", "1", "7", "13", 
"19", "25", "31", "37", "43", "49", "55", "2", "8", "14", "20", 
"26", "32", "38", "44", "50", "56", "6", "12", "18", "24", "30", 
"36", "42", "48", "54", "60", "3", "9", "15", "21", "27", "33", 
"39", "45", "51", "57"))

some of the models i'd like to test

models<-c("Proxy~ Libellé", "Proxy~ Annee", "Proxy~ Max", "Proxy~ QuinzaineMax", 
"Proxy~ DébutB", "Proxy~ FinB", "Proxy~ DuréeB", "Proxy~ AbCum", 
"Proxy~ NbBloom", "Proxy~ Libellé:Annee", "Proxy~ Libellé:Max", 
"Proxy~ Libellé:QuinzaineMax", "Proxy~ Libellé:DébutB", "Proxy~ Libellé:FinB", 
"Proxy~ Libellé:DuréeB", "Proxy~ Libellé:AbCum", "Proxy~ Libellé:NbBloom", 
"Proxy~ Annee:Max", "Proxy~ Annee:QuinzaineMax", "Proxy~ Annee:DébutB", 
"Proxy~ Annee:FinB", "Proxy~ Annee:DuréeB", "Proxy~ Annee:AbCum", 
"Proxy~ Annee:NbBloom", "Proxy~ Max:QuinzaineMax", "Proxy~ Max:DébutB", 
"Proxy~ Max:FinB", "Proxy~ Max:DuréeB", "Proxy~ Max:AbCum", "Proxy~ Max:NbBloom", 
"Proxy~ QuinzaineMax:DébutB", "Proxy~ QuinzaineMax:FinB", "Proxy~ QuinzaineMax:DuréeB", 
"Proxy~ Libellé+Libellé:Max", "Proxy~ Libellé+Libellé:QuinzaineMax", 
"Proxy~ Libellé+Libellé:DébutB", "Proxy~ Libellé+Libellé:FinB", 
"Proxy~ Libellé+Libellé:DuréeB", "Proxy~ Libellé+Libellé:AbCum", 
"Proxy~ Libellé+Libellé:NbBloom", "Proxy~ Libellé+Annee:Max", 
"Proxy~ Libellé+Annee:QuinzaineMax", "Proxy~ Libellé+Annee:DébutB", 
"Proxy~ Libellé+Annee:FinB", "Proxy~ Libellé+Annee:DuréeB", "Proxy~ Libellé+Annee:AbCum", 
"Proxy~ Libellé+Annee:NbBloom", "Proxy~ Libellé+Max:QuinzaineMax", 
"Proxy~ Libellé+Max:DébutB", "Proxy~ Libellé+Max:FinB", "Proxy~ Libellé+Max:DuréeB", 
"Proxy~ Libellé+Max:AbCum", "Proxy~ Libellé+Max:NbBloom", "Proxy~ Libellé+QuinzaineMax:DébutB", 
"Proxy~ Libellé+QuinzaineMax:FinB", "Proxy~ Libellé+QuinzaineMax:DuréeB", 
"Proxy~ Libellé+QuinzaineMax:AbCum", "Proxy~ Libellé+QuinzaineMax:NbBloom", 
"Proxy~ Libellé+DébutB:FinB", "Proxy~ Libellé+DébutB:DuréeB", 
"Proxy~ Libellé+DébutB:AbCum", "Proxy~ Libellé+DébutB:NbBloom", 
"Proxy~ Libellé+FinB:DuréeB", "Proxy~ Libellé+FinB:AbCum", "Proxy~ Libellé+FinB:NbBloom", 
"Proxy~ Libellé+DuréeB:AbCum", "Proxy~ Libellé+DuréeB:NbBloom", 
"Proxy~ Libellé+AbCum:NbBloom", "Proxy~ Annee+Max", "Proxy~ Annee+QuinzaineMax", 
"Proxy~ Annee+DébutB", "Proxy~ Annee+FinB", "Proxy~ Annee+DuréeB", 
"Proxy~ Annee+AbCum", "Proxy~ Annee+NbBloom", "Proxy~ Annee+Libellé:Annee", 
"Proxy~ Annee+Libellé:Max", "Proxy~ Annee+Libellé:QuinzaineMax", 
"Proxy~ Annee+Libellé:DébutB", "Proxy~ Annee+Libellé:FinB", "Proxy~ Annee+Libellé:DuréeB", 
"Proxy~ Annee+Libellé:AbCum", "Proxy~ Annee+Libellé:NbBloom", 
"Proxy~ Annee+Annee:Max", "Proxy~ Annee+Annee:QuinzaineMax", 
"Proxy~ Annee+Annee:DébutB", "Proxy~ Annee+Annee:FinB", "Proxy~ Annee+Annee:DuréeB", 
"Proxy~ Annee+Annee:AbCum", "Proxy~ Annee+Annee:NbBloom", "Proxy~ Annee+Max:QuinzaineMax", 
"Proxy~ Annee+Max:DébutB", "Proxy~ Annee+Max:FinB", "Proxy~ Annee+Max:DuréeB", 
"Proxy~ Annee+Max:AbCum", "Proxy~ Annee+Max:NbBloom", "Proxy~ Annee+QuinzaineMax:DébutB", 
"Proxy~ Libellé+Libellé:DuréeB+Max:DébutB", "Proxy~ Libellé+Libellé:DuréeB+Max:FinB", 
"Proxy~ Libellé+Libellé:DuréeB+Max:DuréeB", "Proxy~ Libellé+Libellé:DuréeB+Max:AbCum", 
"Proxy~ Libellé+Libellé:DuréeB+Max:NbBloom", "Proxy~ Libellé+Libellé:DuréeB+QuinzaineMax:DébutB", 
"Proxy~ Libellé+Libellé:DuréeB+QuinzaineMax:FinB", "Proxy~ Libellé+Libellé:DuréeB+QuinzaineMax:DuréeB", 
"Proxy~ Libellé+Libellé:DuréeB+QuinzaineMax:AbCum", "Proxy~ Libellé+Libellé:DuréeB+QuinzaineMax:NbBloom", 
"Proxy~ Libellé+Libellé:DuréeB+DébutB:FinB", "Proxy~ Libellé+Libellé:DuréeB+DébutB:DuréeB", 
"Proxy~ Libellé+Libellé:DuréeB+DébutB:AbCum", "Proxy~ Libellé+Libellé:DuréeB+DébutB:NbBloom", 
"Proxy~ Libellé+Libellé:DuréeB+FinB:DuréeB", "Proxy~ Libellé+Libellé:DuréeB+FinB:AbCum", 
"Proxy~ Libellé+Libellé:DuréeB+FinB:NbBloom", "Proxy~ Libellé+Libellé:DuréeB+DuréeB:AbCum", 
"Proxy~ Libellé+Libellé:DuréeB+DuréeB:NbBloom", "Proxy~ Libellé+Libellé:DuréeB+AbCum:NbBloom", 
"Proxy~ Libellé+Libellé:AbCum+Libellé:NbBloom", "Proxy~ Libellé+Libellé:AbCum+Annee:Max", 
"Proxy~ Libellé+Libellé:AbCum+Annee:QuinzaineMax", "Proxy~ Libellé+Libellé:AbCum+Annee:DébutB", 
"Proxy~ Libellé+Max:QuinzaineMax+QuinzaineMax:DébutB", "Proxy~ Libellé+Max:QuinzaineMax+QuinzaineMax:FinB", 
"Proxy~ Libellé+Max:QuinzaineMax+QuinzaineMax:DuréeB", "Proxy~ Libellé+Max:QuinzaineMax+QuinzaineMax:AbCum", 
"Proxy~ Libellé+Max:QuinzaineMax+QuinzaineMax:NbBloom", "Proxy~ Libellé+Max:QuinzaineMax+DébutB:FinB", 
"Proxy~ Libellé+Max:QuinzaineMax+DébutB:DuréeB", "Proxy~ Libellé+Max:QuinzaineMax+DébutB:AbCum", 
"Proxy~ Libellé+Max:QuinzaineMax+DébutB:NbBloom", "Proxy~ Libellé+Max:QuinzaineMax+FinB:DuréeB", 
"Proxy~ Libellé+Max:QuinzaineMax+FinB:AbCum", "Proxy~ Libellé+Max:QuinzaineMax+FinB:NbBloom", 
"Proxy~ Libellé+Max:QuinzaineMax+DuréeB:AbCum", "Proxy~ Libellé+Max:QuinzaineMax+DuréeB:NbBloom", 
"Proxy~ Libellé+Max:QuinzaineMax+AbCum:NbBloom", "Proxy~ Libellé+Max:DébutB+Max:FinB") 

thank you for your time!

Answer

Hong Ooi picture Hong Ooi · Jun 19, 2013

Note that the default link function for the Gamma family is the inverse link. If you're comparing with a lognormal model, which, if I understand correctly, means log-transforming the response and using linear regression, you probably want a log link (family=Gamma(link=log)).

In my experience the log link also tends to be less fragile, so this could well solve your convergence problem. Indeed, when I fit the model using the log link, it converged as expected.

It won't solve your other problem of trying to make something sensible out of fitting 15000 models, but that's another issue.