I am analysing a dataset with a GLM in R. The predictor variables are "Probe" (the type of probe used in the experiment; a factor with 4 levels), "Extraction" (the type of extraction used in the experiment; a factor with 2 levels), "Tank" (the number of the tank the sample was collected from; integers from 1 to 9), and "Dilution" (the dilution of each sample; numeric: 3.125, 6.25, 12.5, 25, 50, 100). The response is the number of positive responses ("Positive") obtained from a number of repetitions of the experiment ("Rep"). I want to assess the effects of all predictor variables (and their interactions) on the number of positive responses, so I tried to fit a GLM like this:
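# Two-column response: successes (Positive) and failures (Rep - Positive)
y <- cbind(mydata$Positive, mydata$Rep - mydata$Positive)
# Full factorial model: all main effects and all interactions
model1 <- glm(y ~ Probe * Extraction * Dilution * Tank, family = quasibinomial, data = mydata)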
But I was later advised by my supervisor that the "Tank" predictor should not be treated as a numeric variable: it takes the values 1 to 9, but these are just tank labels, so the difference between tank 1 and, say, tank 7 carries no meaning. Treating it as an ordinary (fixed) factor would only produce a very large model with poor results. So how can I treat "Tank" as a random factor and include it in the GLM?
Thanks
This is called a "mixed-effects model" (here, a generalized linear mixed model). Check out the lme4 package.
library(lme4)
# (1 | Tank) gives each tank its own random intercept
glmer(y ~ Probe + Extraction + Dilution + (1 | Tank), family = binomial, data = mydata)
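To make the call above concrete, here is a minimal sketch that can be run without the original data. The data frame is simulated and purely hypothetical; only the column names come from the question. The response is written as cbind(Positive, Rep - Positive) inside the formula so that everything is looked up in mydata. Note that the family is binomial rather than the question's quasibinomial; as far as I know, glmer does not accept quasi families.

```r
library(lme4)

set.seed(1)

# Hypothetical stand-in for mydata: column names follow the question,
# the values are simulated for illustration only
mydata <- expand.grid(
  Probe      = factor(paste0("P", 1:4)),
  Extraction = factor(c("E1", "E2")),
  Tank       = 1:9,
  Dilution   = c(3.125, 6.25, 12.5, 25, 50, 100)
)
mydata$Rep <- 8  # assumed number of repetitions per row

# One random shift per tank on the logit scale (assumed sd of 0.5)
tank_effect <- rnorm(9, sd = 0.5)
eta <- -1 + 0.02 * mydata$Dilution + tank_effect[mydata$Tank]
mydata$Positive <- rbinom(nrow(mydata), size = mydata$Rep, prob = plogis(eta))

# Tank is a label, not a quantity, so store it as a factor
mydata$Tank <- factor(mydata$Tank)

# Fixed effects for Probe, Extraction and Dilution; random intercept per tank
fit <- glmer(cbind(Positive, Rep - Positive) ~ Probe + Extraction + Dilution + (1 | Tank),
             family = binomial, data = mydata)

fixef(fit)    # fixed-effect estimates on the logit scale
VarCorr(fit)  # estimated between-tank standard deviation
```

fixef() and VarCorr() are the first things to look at: the former gives the effects of the experimental variables, the latter how much the tanks differ from one another.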
Also, you should probably use + instead of * to combine the predictors. In a formula, * expands to all main effects plus every interaction among them (up to the four-way interaction in your model), which would give a huge, overfitted model. Unless you have a specific reason to believe that there is an interaction, leave it out; if you do, code that interaction explicitly (for example, Probe:Dilution).
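The difference between + and * can be checked in base R without fitting anything. The sketch below uses a hypothetical design grid with the factor levels described in the question, and Probe:Dilution is only an example of an explicitly coded interaction, not a recommendation for this dataset.

```r
# Terms generated by each formula (no data needed)
full     <- attr(terms(~ Probe * Extraction * Dilution * Tank), "term.labels")
additive <- attr(terms(~ Probe + Extraction + Dilution + Tank), "term.labels")
explicit <- attr(terms(~ Probe + Extraction + Dilution + Probe:Dilution), "term.labels")

length(full)      # 15 terms: 4 main effects + 11 interactions
length(additive)  # 4 terms: main effects only
explicit          # the three main effects plus the one chosen interaction

# Hypothetical design grid, to count the coefficients when Tank is a fixed factor
grid <- expand.grid(
  Probe      = factor(1:4),
  Extraction = factor(1:2),
  Dilution   = c(3.125, 6.25, 12.5, 25, 50, 100),
  Tank       = factor(1:9)
)
n_full     <- ncol(model.matrix(~ Probe * Extraction * Dilution * Tank, grid))
n_additive <- ncol(model.matrix(~ Probe + Extraction + Dilution + Tank, grid))
n_full      # 144 coefficients
n_additive  # 14 coefficients
```

With Tank as a fixed factor, the fully crossed model needs 144 coefficients, which is why it overfits so easily.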