Fixed effects in mixed models (lme4)


Hi all,
I’ve been fitting a mixed model to my data and ran into an issue. I’m testing the attacking behaviour of a spider and measuring initial response time. I have two fixed factors: chemical cue at three levels (CTL, PRD and NPR) and trial, since I measured each individual on seven consecutive days, giving trial seven levels (1, 2 … 7). I’ve also added the spiders’ id as a random effect.
Here is my model and the summary:

res=glmer(resp~cue*trial+(1|id), family = Gamma(link = "log"), data, na.action = na.omit)

Fixed effects:
              Estimate Std. Error t value Pr(>|z|)    
(Intercept)  -2.933512   0.217584 -13.482   <2e-16 ***
cueNPR        0.489303   0.301143   1.625    0.104    
cuePRD        0.047111   0.297891   0.158    0.874    
trial         0.058917   0.035979   1.638    0.102    
cueNPR:trial -0.007851   0.049736  -0.158    0.875    
cuePRD:trial  0.033758   0.047423   0.712    0.477

So it estimates the non-reference levels of the cue factor separately (NPR and PRD) and compares each of them with the first level (CTL). However, I want one overall estimate for the cue effect. If I make a new column named ‘cue.2’ and replace the level names (CTL, PRD, NPR) with numbers (1, 2, and 3), the model gives me what I want:

'data.frame': 324 obs. of 5 variables:
$ id : int 92 92 92 92 92 92 92 92 176 176 …
$ trial: int 1 2 3 4 5 6 7 8 1 2 …
$ cue : Factor w/ 3 levels "CTL","NPR","PRD": 3 3 3 3 3 3 3 3 1 1 …
$ cue.2: int 2 2 2 2 2 2 2 2 1 1 …
$ resp : num 0.03 0.25 0.03 0.03 0.05 0.03 0.1 0.02 0.03 0.02 …

res2=glmer(resp~cue.2*trial+(1|id), family = Gamma(link = "log"), data, na.action = na.omit)

Fixed effects:
             Estimate Std. Error t value Pr(>|z|)    
(Intercept) -3.262103   0.001631 -2000.2  < 2e-16 ***
cue.2        0.249467   0.001629   153.2  < 2e-16 ***
trial        0.079490   0.001587    50.1  < 2e-16 ***
cue.2:trial -0.004731   0.001521    -3.1  0.00187 ** 

Am I doing the right thing, given that the second model gives me strongly significant p-values?

Any comment would be much appreciated!



Hi Mohammad,

Assuming that ‘cue’ corresponds to three distinct treatments (i.e. it is a categorical variable), you should use cue rather than cue.2. When fitting the model using cue.2, you are assuming that there is a linear relationship between your response variable and a ‘continuous’ variable that takes only three values (1,2,3), but for a categorical variable, the numbering of levels is arbitrary (except for ordinal variables). Thus, in the second table of output, the parameter corresponding to cue.2 is a slope, but I suspect that is not what you want.

I would go further and say that you probably also want to convert trial to a categorical variable. As it stands, you are assuming that there is a linear change in the response variable with trial number.
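To make the difference between the two codings concrete, here is a small sketch in base R. The `toy` data frame is made up for illustration (it is not your data); it only borrows your column names and the numbering CTL = 1, PRD = 2, NPR = 3 from your post. It shows the design-matrix columns that R builds for a factor versus a numeric predictor, and how trial could be converted to a factor.

```r
# Toy design (hypothetical): one row per cue level, using the same
# numbering as in the post (CTL = 1, PRD = 2, NPR = 3)
toy <- data.frame(
  cue   = factor(c("CTL", "PRD", "NPR")),
  cue.2 = c(1, 2, 3)
)

# Factor coding (R's default treatment contrasts): the first level, CTL,
# is absorbed into the intercept, and every other level gets its own
# 0/1 column. That is where the cueNPR and cuePRD rows come from.
X_factor <- model.matrix(~ cue, data = toy)
print(X_factor)

# Numeric coding: a single column, so the model fits one slope per unit
# step in cue.2 and assumes the levels are equally spaced and ordered.
X_numeric <- model.matrix(~ cue.2, data = toy)
print(X_numeric)

# Treating trial as categorical: convert it to a factor before fitting.
trial_f <- factor(1:7)
print(nlevels(trial_f))
```

With the factor coding, each coefficient is a difference from CTL on the log scale, which is why the summary reports two rows for cue rather than one.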




Thanks Drew for your comment,
In the first table, I don’t quite understand why the mixed model behaves like this and compares two levels (PRD and NPR) of the same categorical variable (cue) with the third level (CTL).
Is there any way to get a single parameter estimate for a categorical variable that has more than two levels?



Hi Mohammad,

It’s not clear exactly what your overarching goal is. Do you really want to estimate a parameter, or are you just trying to get a sense of the categorical variable’s overall importance in the model (i.e., more of an inferential statistics question)? If it is the latter, you can assess the overall significance of the cue “effect” by running the following on nested models:

anova(mod1, mod2, test = "Chisq")

where mod1 is the model with the cue predictor and mod2 is the same model without it. Because your model includes the cue:trial interaction, removing cue means removing that interaction as well, so mod2 contains only trial and the random effect. This gives you an overall “effect” for the categorical predictor as a chi-square test statistic and p-value (a likelihood ratio test). I’m not sure that this is what you’re after. As Drew says, if this is a categorical variable (that’s not ordinal), then I don’t think an overall effect for it is useful, as you would naturally want to make predictions within its levels.
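Here is a rough sketch of that comparison. It assumes simulated data in place of your real measurements: the data frame `d`, its values, and the simulation settings are all made up, and only the column names and the model formula follow your post.

```r
library(lme4)

set.seed(1)

# Simulated stand-in for the real data (hypothetical values):
# 30 spiders, 7 trials each, one cue treatment per spider
d <- expand.grid(id = factor(1:30), trial = 1:7)
d$cue <- factor(c("CTL", "NPR", "PRD")[(as.integer(d$id) - 1) %% 3 + 1])

u  <- rnorm(30, sd = 0.3)                              # random intercept per spider
mu <- exp(-3 + 0.05 * d$trial + u[as.integer(d$id)])   # mean response on the log link
d$resp <- rgamma(nrow(d), shape = 2, rate = 2 / mu)    # Gamma response with mean mu

# mod1: full model with cue and the cue:trial interaction
mod1 <- glmer(resp ~ cue * trial + (1 | id),
              family = Gamma(link = "log"), data = d)

# mod2: same model with every term involving cue removed
mod2 <- glmer(resp ~ trial + (1 | id),
              family = Gamma(link = "log"), data = d)

# Likelihood ratio test of all cue terms at once
lrt <- anova(mod1, mod2)
print(lrt)
```

With a three-level cue and a numeric trial, this test drops four parameters at once (two cue main effects and two cue:trial terms), so it answers "does cue matter at all?" rather than giving a single coefficient. Gamma GLMMs can also produce convergence warnings, so it is worth checking those before trusting the p-value.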

Anyway, sorry if that’s not what you’re after, but let us know what you want to do with the model and this variable (prediction, or just understanding its importance), and that might help point you down the right path.



Hi Dan,
Yes, I was looking for something more than just statistical inference, something like a coefficient that shows an overall effect of the cue as a categorical variable. But you are absolutely right, my predictions should be based on the levels of the categorical variable. So I’m going to revisit my hypothesis and predictions and calculate p-values with the anova function.

Thank you for your comment,