Multilevel Modeling

Nested Data

Nested data is data in which a hierarchical structure is present. We may think of:

Students in Classes in Schools
Neighbourhoods in Cities in Countries
Measurements in Individuals

Characteristic of these data is that two students within the same school, neighbourhoods in the same city and measurements in the same individuals are more alike than two students in different schools, etc.

We may use multilevel models to estimate such nested data. To use these models in R the package lme4 must be installed. This package considers linear mixed models.

Intuitive example

Pupils in classes are rated on their popularity. The students’ extraversion and gender and the teachers’ experience were measured as predictor variables (Hox, J.J. (2002). Multilevel analysis: Techniques and applications. Hove: Routledge).

First we fit an ‘empty’ baseline model (this is also called an intercept-only model):

\[\text{popularity}_{ij} = \beta_{0j} + e_{ij} \] \[\beta_{0j} = \gamma_{00} +u_{0j}\]

require(haven)
require(lme4)
Popular = read_spss("popular.sav") 
fit <- lmer(popular ~  (1 | class),
            data = Popular, REML=T)

summary(fit)

## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ (1 | class)
##    Data: Popular
## 
## REML criterion at convergence: 6330.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.5655 -0.6975  0.0020  0.6758  3.3175 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  class    (Intercept) 0.7021   0.8379  
##  Residual             1.2218   1.1053  
## Number of obs: 2000, groups:  class, 100
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  5.07786    0.08739    58.1

Adding first level predictors

\[\text{popularity}_{ij} = \beta_{0j} + \beta_1\text{gender}_{ij} + \beta_2\text{extraversion}_{ij} + e_{ij} \] \[\beta_{0j} = \gamma_{00} +u_{0j}\]

fit <- lmer(popular ~ gender + extrav + (1 | class),
            data = Popular, REML=T)

summary(fit)

## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ gender + extrav + (1 | class)
##    Data: Popular
## 
## REML criterion at convergence: 4948.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.2091 -0.6575 -0.0044  0.6732  2.9755 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  class    (Intercept) 0.6272   0.7919  
##  Residual             0.5921   0.7695  
## Number of obs: 2000, groups:  class, 100
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  2.14096    0.11729   18.25
## gender       1.25300    0.03743   33.48
## extrav       0.44161    0.01616   27.33
## 
## Correlation of Fixed Effects:
##        (Intr) gender
## gender -0.100       
## extrav -0.705 -0.085

Difference with previous models

So far, we had models wherein we would use covariates to predict / estimate an effect on a certain outcome.

In these models, each case has an individual effect and an individual error.
The errors were (until now) assumed to be independently distributed over cases.

Now we assume that errors are not independent, but are governed by some clustering structure. E.g. students in schools.

Instead of:

\[ y_i=\beta_{0}+e_i \]

We have:

\[ y_{ij} = \beta_{0j}+e_{ij} \] \[ \beta_{0j} = \gamma_{00} + u_{0j} \]

Intercept variance

There are two variance components for the intercept only and the model with fixed-effects gender and extraversion:

                                IO          FE  
Between classes  (class)        0.7021      0.6272     
Between students (Residual)     1.2218      0.5921

Adding second level predictors

\[\text{popularity}_{ij} = \beta_{0j} + \beta_1\text{gender}_{ij} + \beta_2\text{extraversion}_{ij} + e_{ij} \] \[\beta_{0j} = \gamma_{00} + \gamma_{01}\text{experience}_{j} + u_{0j}\]

fit <- lmer(popular ~ gender + extrav + texp + (1 | class),
            data = Popular, REML=T)

## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ gender + extrav + texp + (1 | class)
##    Data: Popular
## 
## REML criterion at convergence: 4885
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.1745 -0.6491 -0.0075  0.6705  3.0078 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  class    (Intercept) 0.2954   0.5435  
##  Residual             0.5920   0.7694  
## Number of obs: 2000, groups:  class, 100
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 0.809766   0.169993   4.764
## gender      1.253800   0.037290  33.623
## extrav      0.454431   0.016165  28.112
## texp        0.088407   0.008764  10.087
## 
## Correlation of Fixed Effects:
##        (Intr) gender extrav
## gender -0.040              
## extrav -0.589 -0.090       
## texp   -0.802 -0.036  0.139

These variances are nested. That is why they decrease when moving up in levels.

Fixed vs Random effects

Fixed effects constant across units

Random effects vary across units, interest in underlying population, effect is a realized value of a random variable

\[Y_{j} = MVN(\beta{}X + b_{j}Z_{j}, \Sigma)\]

\[b_{j} = MVN(0, \Omega)\]

In the two previous models we just had one random intercept. The model matrix, \(Z_{j}\) for the random effects of class \(j\) with 5 students (rows) looks like:

##      [,1]
## [1,]    1
## [2,]    1
## [3,]    1
## [4,]    1
## [5,]    1

We can get model matrices in lme4 by using the function getME().

Random Slopes

We add a random slope for extraversion; the effect of extraversion differs per class.

\[\text{popularity}_{ij} = \beta_{0j} + \beta_{1}\text{gender}_{ij} + \beta_{2j}\text{extraversion}_{ij} + e_{ij} \] \[\beta_{0j} = \gamma_{00} + \gamma_{01}\text{experience}_{j} + u_{0j}\] \[\beta_{2j} = \gamma_{20} + u_{2j}\]

fit2 <- lmer(popular ~ gender + extrav + texp + (1 + extrav| class),
             data = Popular, REML=T)

The model matrix for \(Z_{j}\) for the random effects of class \(j\) with now looks like:

##      [,1] [,2]
## [1,]    1    3
## [2,]    1    4
## [3,]    1    1
## [4,]    1    5
## [5,]    1    3

where the extraversion scores of the five students are in the second column.

summary(fit2)

## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ gender + extrav + texp + (1 + extrav | class)
##    Data: Popular
## 
## REML criterion at convergence: 4834.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.1768 -0.6475 -0.0235  0.6648  2.9684 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  class    (Intercept) 1.30296  1.1415        
##           extrav      0.03455  0.1859   -0.89
##  Residual             0.55209  0.7430        
## Number of obs: 2000, groups:  class, 100
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 0.736154   0.196586   3.745
## gender      1.252254   0.036573  34.240
## extrav      0.452620   0.024614  18.389
## texp        0.090975   0.008685  10.475
## 
## Correlation of Fixed Effects:
##        (Intr) gender extrav
## gender -0.031              
## extrav -0.717 -0.057       
## texp   -0.688 -0.039  0.086

Slope variance

We now have three variance components:

Between classes  (Intercept)     1.30
                  extrav         0.03
Between students (Residual)      0.55

Cross-level interaction

To explain the slope variance we may add an interaction between extraversion and a second level predictor

\[\text{popularity}_{ij} = \beta_{0j} + \beta_{1}\text{gender}_{ij} + \beta_{2j}\text{extraversion}_{ij} + e_{ij} \]

\[\beta_{0j} = \gamma_{00} + \gamma_{01}\text{experience}_{j} + u_{0j}\]

\[\beta_{2j} = \gamma_{20} + \gamma_{21}\text{experience}_{j} + u_{2j}\]

fit3 <- lmer(popular ~ gender + extrav*texp + (1 + extrav| class),
             data = Popular, REML=T)

summary(fit3)

## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ gender + extrav * texp + (1 + extrav | class)
##    Data: Popular
## 
## REML criterion at convergence: 4780.5
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -3.12872 -0.63857 -0.01129  0.67916  3.05006 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  class    (Intercept) 0.478639 0.69184       
##           extrav      0.005409 0.07355  -0.64
##  Residual             0.552769 0.74348       
## Number of obs: 2000, groups:  class, 100
## 
## Fixed effects:
##              Estimate Std. Error t value
## (Intercept) -1.209607   0.271901  -4.449
## gender       1.240698   0.036233  34.243
## extrav       0.803578   0.040117  20.031
## texp         0.226197   0.016807  13.458
## extrav:texp -0.024728   0.002555  -9.679
## 
## Correlation of Fixed Effects:
##             (Intr) gender extrav texp  
## gender       0.002                     
## extrav      -0.867 -0.065              
## texp        -0.916 -0.047  0.801       
## extrav:texp  0.773  0.033 -0.901 -0.859

By investigating the variances we see how much slope variance was explained by adding the cross-level interaction:

print(VarCorr(fit2), comp="Variance")

##  Groups   Name        Variance Corr  
##  class    (Intercept) 1.302964       
##           extrav      0.034545 -0.885
##  Residual             0.552090

print(VarCorr(fit3), comp="Variance")

##  Groups   Name        Variance Corr  
##  class    (Intercept) 0.478639       
##           extrav      0.005409 -0.640
##  Residual             0.552769

\[\frac{0.034547-0.0054094}{0.034547} \approx 0.84 \]

Restricted vs Full ML Estimation

To investigate model fit by means of the deviance we need to switch our estimation method:

fit <- lmer(popular ~ gender + extrav + texp + (1 | class),
            data = Popular, REML=F)
fit2 <- lmer(popular ~ gender + extrav + texp + (1 + extrav| class),
             data = Popular, REML=F)
fit3 <- lmer(popular ~ gender + extrav*texp + (1 + extrav| class),
             data = Popular, REML=F)

Comparing Models

We can now compare the models with first and second level predictors, random slope and cross-level interaction by means of \(\chi^2\) tests on the deviance:

anova(fit, fit2, fit3)

## Data: Popular
## Models:
## fit: popular ~ gender + extrav + texp + (1 | class)
## fit2: popular ~ gender + extrav + texp + (1 + extrav | class)
## fit3: popular ~ gender + extrav * texp + (1 + extrav | class)
##      Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)    
## fit   6 4874.3 4907.9 -2431.2   4862.3                             
## fit2  8 4828.8 4873.6 -2406.4   4812.8 49.489      2  1.794e-11 ***
## fit3  9 4765.6 4816.0 -2373.8   4747.6 65.183      1  6.827e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Obtaining confidence intervals/p-values

This is removed from lme4 because there is no general reliable way to estimate the intervals under every possible condition. See ?pvalues for more detailed info.