Model constructors

The LinearMixedModel type represents a linear mixed-effects model. Typically, it is constructed from a Formula and an appropriate Table type, usually a DataFrame.

Examples of linear mixed-effects model fits

For illustration, several data sets from the lme4 package for R are made available in .arrow format in this package. Often, for convenience, we will convert these to DataFrames. These data sets include the dyestuff and dyestuff2 data sets.

using DataFrames, MixedModels, MixedModelsDatasets, StatsModels
dyestuff = MixedModelsDatasets.dataset(:dyestuff)
Arrow.Table with 30 rows, 2 columns, and schema:
 :batch  String
 :yield  Int16
describe(DataFrame(dyestuff))
2×7 DataFrame
Rowvariablemeanminmedianmaxnmissingeltype
SymbolUnion…AnyUnion…AnyInt64DataType
1batchAF0String
2yield1527.514401530.016350Int16

The @formula language in Julia

MixedModels.jl builds on the Julia formula language provided by StatsModels.jl, which is similar to the formula language in R and is also based on the notation from Wilkinson and Rogers (1973). There are two ways to construct a formula in Julia. The first way is to enclose the formula expression in the @formula macro:

StatsModels.@formulaMacro
@formula(ex)

Capture and parse a formula expression as a FormulaTerm struct.

A formula is an abstract specification of a dependence between left-hand and right-hand side variables as in, e.g., a regression model. Each side specifies at a high level how tabular data is to be converted to a numerical matrix suitable for modeling. This specification looks something like Julia code, is represented as a Julia Expr, but uses special syntax. The @formula macro takes an expression like y ~ 1 + a*b, transforms it according to the formula syntax rules into a lowered form (like y ~ 1 + a + b + a&b), and constructs a FormulaTerm struct which captures the original expression, the lowered expression, and the left- and right-hand-side.

Operators that have special interpretations in this syntax are

  • ~ is the formula separator, where it is a binary operator (the first argument is the left-hand side, and the second is the right-hand side.
  • + concatenates variables as columns when generating a model matrix.
  • & represents an interaction between two or more variables, which corresponds to a row-wise kronecker product of the individual terms (or element-wise product if all terms involved are continuous/scalar).
  • * expands to all main effects and interactions: a*b is equivalent to a+b+a&b, a*b*c to a+b+c+a&b+a&c+b&c+a&b&c, etc.
  • 1, 0, and -1 indicate the presence (for 1) or absence (for 0 and -1) of an intercept column.

The rules that are applied are

  • The associative rule (un-nests nested calls to +, &, and *).
  • The distributive rule (interactions & distribute over concatenation +).
  • The * rule expands a*b to a+b+a&b (recursively).
  • Subtraction is converted to addition and negation, so x-1 becomes x + -1 (applies only to subtraction of literal 1).
  • Single-argument & calls are stripped, so &(x) becomes the main effect x.
source

The second way is to combine Terms with operators like +, &, ~, and others at "run time". This is especially useful if you wish to create a formula from a list a variable names. For instance, the following are equivalent:

@formula(y ~ 1 + a + b + a & b) == (term(:y) ~ term(1) + term(:a) + term(:b) + term(:a) & term(:b))
true

MixedModels.jl provides additional formula syntax for representing random-effects terms. Most importantly, | separates random effects and their grouping factors (as in the formula extension used by the R package lme4). Much like with the base formula language, | can be used within the @formula macro and to construct a formula programmatically:

@formula(y ~ 1 + a + b + (1 + a + b | g))
FormulaTerm
Response:
  y(unknown)
Predictors:
  1
  a(unknown)
  b(unknown)
  (a,b,g)->(1 + a + b) | g
terms = sum(term(t) for t in [1, :a, :b])
group = term(:g)
response = term(:y)
response ~ terms + (terms | group)
FormulaTerm
Response:
  y(unknown)
Predictors:
  1
  a(unknown)
  b(unknown)
  (1 + a + b | g)

Models with simple, scalar random effects

A basic model with simple, scalar random effects for the levels of batch (the batch of an intermediate product, in this case) is declared and fit as

fm = @formula(yield ~ 1 + (1|batch))
fm1 = fit(MixedModel, fm, dyestuff)
Linear mixed model fit by maximum likelihood
 yield ~ 1 + (1 | batch)
   logLik   -2 logLik     AIC       AICc        BIC    
  -163.6635   327.3271   333.3271   334.2501   337.5307

Variance components:
            Column    Variance Std.Dev.
batch    (Intercept)  1388.3332 37.2603
Residual              2451.2500 49.5101
 Number of obs: 30; levels of grouping factors: 6

  Fixed-effects parameters:
────────────────────────────────────────────────
              Coef.  Std. Error      z  Pr(>|z|)
────────────────────────────────────────────────
(Intercept)  1527.5     17.6946  86.33    <1e-99
────────────────────────────────────────────────

You can also use the convenience function lmm to fit the model as follows:

fm = @formula(yield ~ 1 + (1|batch))
fm2 = lmm(fm, dyestuff)
Linear mixed model fit by maximum likelihood
 yield ~ 1 + (1 | batch)
   logLik   -2 logLik     AIC       AICc        BIC    
  -163.6635   327.3271   333.3271   334.2501   337.5307

Variance components:
            Column    Variance Std.Dev.
batch    (Intercept)  1388.3332 37.2603
Residual              2451.2500 49.5101
 Number of obs: 30; levels of grouping factors: 6

  Fixed-effects parameters:
────────────────────────────────────────────────
              Coef.  Std. Error      z  Pr(>|z|)
────────────────────────────────────────────────
(Intercept)  1527.5     17.6946  86.33    <1e-99
────────────────────────────────────────────────

Notice that both are equivalent.

(If you are new to Julia you may find that this first fit takes an unexpectedly long time, due to Just-In-Time (JIT) compilation of the code. The subsequent calls to such functions are much faster.)

using BenchmarkTools
dyestuff2 = MixedModelsDatasets.dataset(:dyestuff2)
@benchmark fit(MixedModel, $fm, $dyestuff2)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (minmax):  208.729 μs  8.863 ms   GC (min … max): 0.00% … 91.77%
 Time  (median):     225.951 μs                GC (median):    0.00%
 Time  (mean ± σ):   241.100 μs ± 249.739 μs   GC (mean ± σ):  4.82% ±  4.57%

     ▅▇▆▃     ▄▇▄▃▁                                             
  ▁▃▇█████▇▆▆██████▇▆▆▇▆▅▅▄▃▃▄▄▅▅▅▄▃▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  209 μs           Histogram: frequency by time          283 μs <

 Memory estimate: 131.07 KiB, allocs estimate: 1047.

By default, the model is fit by maximum likelihood. To use the REML criterion instead, add the optional named argument REML=true to the call to fit

fm1reml = fit(MixedModel, fm, dyestuff, REML=true)
Linear mixed model fit by REML
 yield ~ 1 + (1 | batch)
 REML criterion at convergence: 319.6542768422576

Variance components:
            Column    Variance Std.Dev.
batch    (Intercept)  1764.0503 42.0006
Residual              2451.2499 49.5101
 Number of obs: 30; levels of grouping factors: 6

  Fixed-effects parameters:
────────────────────────────────────────────────
              Coef.  Std. Error      z  Pr(>|z|)
────────────────────────────────────────────────
(Intercept)  1527.5     19.3834  78.80    <1e-99
────────────────────────────────────────────────

Floating-point type in the model

The type of fm1

typeof(fm1)
LinearMixedModel{Float64}

includes the floating point type used internally for the various matrices, vectors, and scalars that represent the model. At present, this will always be Float64 because the parameter estimates are optimized using the NLopt package which calls compiled C code that only allows for optimization with respect to a Float64 parameter vector.

So in theory other floating point types, such as BigFloat or Float32, can be used to define a model but in practice only Float64 works at present.

In theory, theory and practice are the same. In practice, they aren't. – Anon

Simple, scalar random effects

A simple, scalar random effects term in a mixed-effects model formula is of the form (1|G). All random effects terms end with |G where G is the grouping factor for the random effect. The name or, more generally the expression, G, should evaluate to a categorical array that has a distinct set of levels. The random effects are associated with the levels of the grouping factor.

A scalar random effect is, as the name implies, one scalar value for each level of the grouping factor. A simple, scalar random effects term is of the form, (1|G). It corresponds to a shift in the intercept for each level of the grouping factor.

Models with vector-valued random effects

The sleepstudy data are observations of reaction time, reaction, on several subjects, subj, after 0 to 9 days of sleep deprivation, days. A model with random intercepts and random slopes for each subject, allowing for within-subject correlation of the slope and intercept, is fit as

sleepstudy = MixedModelsDatasets.dataset(:sleepstudy)
fm2 = fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days|subj)), sleepstudy)
Linear mixed model fit by maximum likelihood
 reaction ~ 1 + days + (1 + days | subj)
   logLik   -2 logLik     AIC       AICc        BIC    
  -875.9697  1751.9393  1763.9393  1764.4249  1783.0971

Variance components:
            Column    Variance Std.Dev.   Corr.
subj     (Intercept)  565.52071 23.78068
         days          32.68242  5.71685 +0.08
Residual              654.94015 25.59180
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
──────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)
──────────────────────────────────────────────────
(Intercept)  251.405      6.6323   37.91    <1e-99
days          10.4673     1.50224   6.97    <1e-11
──────────────────────────────────────────────────

Models with multiple, scalar random-effects terms

A model for the Penicillin data incorporates random effects for the plate, and for the sample. As every sample is used on every plate these two factors are crossed.

penicillin = MixedModelsDatasets.dataset(:penicillin)
fm3 = fit(MixedModel, @formula(diameter ~ 1 + (1|plate) + (1|sample)), penicillin)
Linear mixed model fit by maximum likelihood
 diameter ~ 1 + (1 | plate) + (1 | sample)
   logLik   -2 logLik     AIC       AICc        BIC    
  -166.0942   332.1883   340.1883   340.4761   352.0676

Variance components:
            Column   Variance Std.Dev. 
plate    (Intercept)  0.714993 0.845572
sample   (Intercept)  3.135139 1.770632
Residual              0.302426 0.549932
 Number of obs: 144; levels of grouping factors: 24, 6

  Fixed-effects parameters:
─────────────────────────────────────────────────
               Coef.  Std. Error      z  Pr(>|z|)
─────────────────────────────────────────────────
(Intercept)  22.9722     0.74459  30.85    <1e-99
─────────────────────────────────────────────────

In contrast, the cask grouping factor is nested within the batch grouping factor in the Pastes data.

pastes = DataFrame(MixedModelsDatasets.dataset(:pastes))
describe(pastes)
3×7 DataFrame
Rowvariablemeanminmedianmaxnmissingeltype
SymbolUnion…AnyUnion…AnyInt64DataType
1batchAJ0String
2caskac0String
3strength60.053354.259.366.00Float64

This can be expressed using the solidus (the "/" character) to separate grouping factors, read "cask nested within batch":

fm4a = fit(MixedModel, @formula(strength ~ 1 + (1|batch/cask)), pastes)
Linear mixed model fit by maximum likelihood
 strength ~ 1 + (1 | batch) + (1 | batch & cask)
   logLik   -2 logLik     AIC       AICc        BIC    
  -123.9972   247.9945   255.9945   256.7217   264.3718

Variance components:
                Column   Variance Std.Dev. 
batch & cask (Intercept)  8.433674 2.904079
batch        (Intercept)  1.199152 1.095058
Residual                  0.678000 0.823407
 Number of obs: 60; levels of grouping factors: 30, 10

  Fixed-effects parameters:
─────────────────────────────────────────────────
               Coef.  Std. Error      z  Pr(>|z|)
─────────────────────────────────────────────────
(Intercept)  60.0533    0.642135  93.52    <1e-99
─────────────────────────────────────────────────

If the levels of the inner grouping factor are unique across the levels of the outer grouping factor, then this nesting does not need to expressed explicitly in the model syntax. For example, defining sample to be the combination of batch and cask, yields a naming scheme where the nesting is apparent from the data even if not expressed in the formula. (That is, each level of sample occurs in conjunction with only one level of batch.) As such, this model is equivalent to the previous one.

pastes.sample = (string.(pastes.cask, "&",  pastes.batch))
fm4b = fit(MixedModel, @formula(strength ~ 1 + (1|sample) + (1|batch)), pastes)
Linear mixed model fit by maximum likelihood
 strength ~ 1 + (1 | sample) + (1 | batch)
   logLik   -2 logLik     AIC       AICc        BIC    
  -123.9972   247.9945   255.9945   256.7217   264.3718

Variance components:
            Column   Variance Std.Dev. 
sample   (Intercept)  8.433661 2.904077
batch    (Intercept)  1.199166 1.095064
Residual              0.678000 0.823407
 Number of obs: 60; levels of grouping factors: 30, 10

  Fixed-effects parameters:
─────────────────────────────────────────────────
               Coef.  Std. Error      z  Pr(>|z|)
─────────────────────────────────────────────────
(Intercept)  60.0533    0.642136  93.52    <1e-99
─────────────────────────────────────────────────

In observational studies it is common to encounter partially crossed grouping factors. For example, the InstEval data are course evaluations by students, s, of instructors, d. Additional covariates include the academic department, dept, in which the course was given and service, whether or not it was a service course.

insteval = MixedModelsDatasets.dataset(:insteval)
fm5 = fit(MixedModel, @formula(y ~ 1 + service * dept + (1|s) + (1|d)), insteval)
Linear mixed model fit by maximum likelihood
 y ~ 1 + service + dept + service & dept + (1 | s) + (1 | d)
    logLik     -2 logLik       AIC         AICc          BIC     
 -118792.7767  237585.5534  237647.5534  237647.5804  237932.8763

Variance components:
            Column   Variance Std.Dev. 
s        (Intercept)  0.105418 0.324681
d        (Intercept)  0.258417 0.508347
Residual              1.384728 1.176745
 Number of obs: 73421; levels of grouping factors: 2972, 1128

  Fixed-effects parameters:
────────────────────────────────────────────────────────────────
                              Coef.  Std. Error      z  Pr(>|z|)
────────────────────────────────────────────────────────────────
(Intercept)              3.27628      0.0793647  41.28    <1e-99
service: Y               0.0116045    0.0699321   0.17    0.8682
dept: D02               -0.0411091    0.120331   -0.34    0.7326
dept: D03                0.00967415   0.108411    0.09    0.9289
dept: D04                0.105017     0.0944965   1.11    0.2664
dept: D05                0.0828643    0.11148     0.74    0.4573
dept: D06               -0.01194      0.0978342  -0.12    0.9029
dept: D07                0.0992679    0.110598    0.90    0.3694
dept: D08                0.0575337    0.127935    0.45    0.6529
dept: D09               -0.00263179   0.107085   -0.02    0.9804
dept: D10               -0.223423     0.099838   -2.24    0.0252
dept: D11                0.0129817    0.110639    0.12    0.9066
dept: D12                0.00503826   0.0944243   0.05    0.9574
dept: D14                0.00508272   0.109041    0.05    0.9628
dept: D15               -0.0466719    0.101942   -0.46    0.6471
service: Y & dept: D02  -0.144352     0.0929527  -1.55    0.1204
service: Y & dept: D03   0.0174077    0.0927237   0.19    0.8511
service: Y & dept: D04  -0.0381263    0.0810901  -0.47    0.6382
service: Y & dept: D05   0.0596631    0.123952    0.48    0.6303
service: Y & dept: D06  -0.254044     0.080781   -3.14    0.0017
service: Y & dept: D07  -0.151634     0.11157    -1.36    0.1741
service: Y & dept: D08   0.0508941    0.112189    0.45    0.6501
service: Y & dept: D09  -0.259448     0.0899448  -2.88    0.0039
service: Y & dept: D10   0.25907      0.11137     2.33    0.0200
service: Y & dept: D11  -0.276577     0.0819621  -3.37    0.0007
service: Y & dept: D12  -0.041849     0.0792928  -0.53    0.5977
service: Y & dept: D14  -0.256742     0.0931016  -2.76    0.0058
service: Y & dept: D15   0.24042      0.0982071   2.45    0.0144
────────────────────────────────────────────────────────────────

Simplifying the random effect correlation structure

MixedModels.jl estimates not only the variance of the effects for each random effect level, but also the correlation between the random effects for different predictors. So, for the model of the sleepstudy data above, one of the parameters that is estimated is the correlation between each subject's random intercept (i.e., their baseline reaction time) and slope (i.e., their particular change in reaction time per day of sleep deprivation). In some cases, you may wish to simplify the random effects structure by removing these correlation parameters. This often arises when there are many random effects you want to estimate (as is common in psychological experiments with many conditions and covariates), since the number of random effects parameters increases as the square of the number of predictors, making these models difficult to estimate from limited data.

The special syntax zerocorr can be applied to individual random effects terms inside the @formula:

fm2zerocorr_fm = fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + days|subj)), sleepstudy)
Linear mixed model fit by maximum likelihood
 reaction ~ 1 + days + zerocorr(1 + days | subj)
   logLik   -2 logLik     AIC       AICc        BIC    
  -876.0016  1752.0033  1762.0033  1762.3481  1777.9680

Variance components:
            Column   Variance Std.Dev.  Corr.
subj     (Intercept)  584.2547 24.1714
         days          33.6330  5.7994   .  
Residual              653.1157 25.5561
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
──────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)
──────────────────────────────────────────────────
(Intercept)  251.405      6.70769  37.48    <1e-99
days          10.4673     1.51931   6.89    <1e-11
──────────────────────────────────────────────────

Alternatively, correlations between parameters can be removed by including them as separate random effects terms:

fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj) + (days|subj)), sleepstudy)
Linear mixed model fit by maximum likelihood
 reaction ~ 1 + days + (1 | subj) + (days | subj)
   logLik   -2 logLik     AIC       AICc        BIC    
  -876.0016  1752.0033  1762.0033  1762.3481  1777.9680

Variance components:
            Column   Variance Std.Dev.  Corr.
subj     (Intercept)  584.2547 24.1714
         days          33.6330  5.7994   .  
Residual              653.1157 25.5561
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
──────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)
──────────────────────────────────────────────────
(Intercept)  251.405      6.70769  37.48    <1e-99
days          10.4673     1.51931   6.89    <1e-11
──────────────────────────────────────────────────

Finally, for predictors that are categorical, MixedModels.jl will estimate correlations between each level. Notice the large number of correlation parameters if we treat days as a categorical variable by giving it contrasts:

fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days|subj)), sleepstudy,
    contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
 reaction ~ 1 + days + (1 + days | subj)
   logLik   -2 logLik     AIC       AICc        BIC    
  -805.3991  1610.7982  1742.7982  1821.0637  1953.5334

Variance components:
            Column    Variance  Std.Dev.   Corr.
subj     (Intercept)   955.98284 30.91897
         days: 1       497.94627 22.31471 -0.30
         days: 2       917.16373 30.28471 -0.57 +0.75
         days: 3      1269.62101 35.63174 -0.37 +0.72 +0.87
         days: 4      1487.38580 38.56664 -0.32 +0.59 +0.67 +0.91
         days: 5      2299.83989 47.95665 -0.25 +0.46 +0.45 +0.70 +0.85
         days: 6      3856.04148 62.09703 -0.28 +0.30 +0.48 +0.70 +0.77 +0.75
         days: 7      1809.67762 42.54031 -0.16 +0.22 +0.47 +0.50 +0.63 +0.64 +0.71
         days: 8      3157.68383 56.19327 -0.20 +0.29 +0.36 +0.56 +0.73 +0.90 +0.73 +0.74
         days: 9      3078.27872 55.48224 +0.05 +0.26 +0.16 +0.38 +0.59 +0.78 +0.38 +0.53 +0.85
Residual                18.89497  4.34683
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
───────────────────────────────────────────────────
                 Coef.  Std. Error      z  Pr(>|z|)
───────────────────────────────────────────────────
(Intercept)  256.652       7.35934  34.87    <1e-99
days: 1        7.84395     5.45556   1.44    0.1505
days: 2        8.71009     7.28375   1.20    0.2318
days: 3       26.3402      8.52255   3.09    0.0020
days: 4       31.9976      9.205     3.48    0.0005
days: 5       51.8666     11.396     4.55    <1e-05
days: 6       55.5264     14.708     3.78    0.0002
days: 7       62.0988     10.131     6.13    <1e-09
days: 8       79.9777     13.3239    6.00    <1e-08
days: 9       94.1994     13.1573    7.16    <1e-12
───────────────────────────────────────────────────

Separating the 1 and days random effects into separate terms removes the correlations between the intercept and the levels of days, but not between the levels themselves:

fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj) + (days|subj)), sleepstudy,
    contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
 reaction ~ 1 + days + (1 | subj) + (days | subj)
   logLik   -2 logLik     AIC       AICc        BIC    
  -815.9006  1631.8012  1745.8012  1799.9980  1927.7998

Variance components:
            Column    Variance  Std.Dev.   Corr.
subj     (Intercept)   917.91162 30.29706
         days: 1       449.11133 21.19225   .  
         days: 2       842.56594 29.02699   .   +0.75
         days: 3      1202.17703 34.67242   .   +0.72 +0.88
         days: 4      1421.15767 37.69824   .   +0.58 +0.66 +0.92
         days: 5      2232.13683 47.24550   .   +0.45 +0.44 +0.70 +0.85
         days: 6      3777.16785 61.45867   .   +0.28 +0.48 +0.70 +0.78 +0.75
         days: 7      1756.91352 41.91555   .   +0.20 +0.46 +0.49 +0.62 +0.64 +0.71
         days: 8      3090.59943 55.59316   .   +0.27 +0.35 +0.56 +0.73 +0.90 +0.73 +0.74
         days: 9      3048.06900 55.20932   .   +0.25 +0.15 +0.38 +0.59 +0.78 +0.38 +0.53 +0.86
Residual                35.64042  5.96996
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
───────────────────────────────────────────────────
                 Coef.  Std. Error      z  Pr(>|z|)
───────────────────────────────────────────────────
(Intercept)  256.652       7.2784   35.26    <1e-99
days: 1        7.84395     5.37686   1.46    0.1446
days: 2        8.71009     7.12526   1.22    0.2215
days: 3       26.3402      8.41116   3.13    0.0017
days: 4       31.9976      9.10567   3.51    0.0004
days: 5       51.8666     11.3123    4.58    <1e-05
days: 6       55.5264     14.622     3.80    0.0001
days: 7       62.0988     10.078     6.16    <1e-09
days: 8       79.9777     13.2537    6.03    <1e-08
days: 9       94.1994     13.1642    7.16    <1e-12
───────────────────────────────────────────────────

(Notice that the variance component for days: 1 is estimated as zero, so the correlations for this component are undefined and expressed as NaN, not a number.)

An alternative is to force all the levels of days as indicators using fulldummy encoding.

fit(MixedModel, @formula(reaction ~ 1 + days + (1 + fulldummy(days)|subj)), sleepstudy,
    contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
 reaction ~ 1 + days + (1 + days | subj)
   logLik   -2 logLik     AIC       AICc        BIC    
  -805.3991  1610.7982  1764.7982  1882.5629  2010.6559

Variance components:
            Column    Variance  Std.Dev.   Corr.
subj     (Intercept)   640.46761 25.30746
         days: 0       526.67309 22.94936 -0.18
         days: 1       653.73096 25.56816 -0.20 +0.58
         days: 2       461.88299 21.49146 -0.28 +0.07 +0.65
         days: 3       417.46155 20.43188 +0.34 -0.35 +0.43 +0.64
         days: 4       459.57667 21.43774 +0.54 -0.51 +0.12 +0.08 +0.72
         days: 5      1266.35955 35.58595 +0.34 -0.31 +0.06 -0.11 +0.35 +0.71
         days: 6      2424.54679 49.23969 +0.28 -0.40 -0.18 -0.03 +0.40 +0.59 +0.58
         days: 7      1297.44409 36.02005 +0.23 +0.01 +0.04 +0.16 +0.11 +0.33 +0.42 +0.51
         days: 8      2049.38910 45.27018 +0.31 -0.28 -0.10 -0.17 +0.16 +0.52 +0.84 +0.57 +0.59
         days: 9      2515.80443 50.15780 +0.42 -0.02 +0.09 -0.21 +0.04 +0.43 +0.71 +0.13 +0.39 +0.81
Residual                19.14750  4.37579
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
───────────────────────────────────────────────────
                 Coef.  Std. Error      z  Pr(>|z|)
───────────────────────────────────────────────────
(Intercept)  256.652       7.35899  34.88    <1e-99
days: 1        7.84395     5.45632   1.44    0.1506
days: 2        8.71009     7.28463   1.20    0.2318
days: 3       26.3402      8.52439   3.09    0.0020
days: 4       31.9976      9.20677   3.48    0.0005
days: 5       51.8667     11.397     4.55    <1e-05
days: 6       55.5265     14.7079    3.78    0.0002
days: 7       62.0988     10.1311    6.13    <1e-09
days: 8       79.9777     13.3241    6.00    <1e-08
days: 9       94.1994     13.1575    7.16    <1e-12
───────────────────────────────────────────────────

This fit produces a better fit as measured by the objective (negative twice the log-likelihood is 1610.8) but at the expense of adding many more parameters to the model. As a result, model comparison criteria such, as AIC and BIC, are inflated.

But using zerocorr on the individual terms does remove the correlations between the levels:

fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + days|subj)), sleepstudy,
    contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
 reaction ~ 1 + days + zerocorr(1 + days | subj)
   logLik   -2 logLik     AIC       AICc        BIC    
  -882.9138  1765.8276  1807.8276  1813.6757  1874.8797

Variance components:
            Column    Variance Std.Dev.  Corr.
subj     (Intercept)   958.5364 30.9602
         days: 1         0.0000  0.0000   .  
         days: 2         0.0000  0.0000   .     .  
         days: 3         0.0000  0.0000   .     .     .  
         days: 4         0.0000  0.0000   .     .     .     .  
         days: 5       519.6046 22.7948   .     .     .     .     .  
         days: 6      1704.0493 41.2801   .     .     .     .     .     .  
         days: 7       608.7797 24.6735   .     .     .     .     .     .     .  
         days: 8      1273.0916 35.6804   .     .     .     .     .     .     .     .  
         days: 9      1754.0795 41.8817   .     .     .     .     .     .     .     .     .  
Residual               434.8566 20.8532
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
───────────────────────────────────────────────────
                 Coef.  Std. Error      z  Pr(>|z|)
───────────────────────────────────────────────────
(Intercept)  256.652       8.79834  29.17    <1e-99
days: 1        7.84395     6.95107   1.13    0.2591
days: 2        8.71009     6.95107   1.25    0.2102
days: 3       26.3402      6.95107   3.79    0.0002
days: 4       31.9976      6.95107   4.60    <1e-05
days: 5       51.8666      8.78546   5.90    <1e-08
days: 6       55.5264     11.9577    4.64    <1e-05
days: 7       62.0988      9.06303   6.85    <1e-11
days: 8       79.9777     10.9108    7.33    <1e-12
days: 9       94.1994     12.0734    7.80    <1e-14
───────────────────────────────────────────────────
fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj) + zerocorr(days|subj)), sleepstudy,
    contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
 reaction ~ 1 + days + (1 | subj) + zerocorr(days | subj)
   logLik   -2 logLik     AIC       AICc        BIC    
  -882.9138  1765.8276  1807.8276  1813.6757  1874.8797

Variance components:
            Column    Variance Std.Dev.  Corr.
subj     (Intercept)   958.5364 30.9602
         days: 1         0.0000  0.0000   .  
         days: 2         0.0000  0.0000   .     .  
         days: 3         0.0000  0.0000   .     .     .  
         days: 4         0.0000  0.0000   .     .     .     .  
         days: 5       519.6046 22.7948   .     .     .     .     .  
         days: 6      1704.0493 41.2801   .     .     .     .     .     .  
         days: 7       608.7797 24.6735   .     .     .     .     .     .     .  
         days: 8      1273.0916 35.6804   .     .     .     .     .     .     .     .  
         days: 9      1754.0795 41.8817   .     .     .     .     .     .     .     .     .  
Residual               434.8566 20.8532
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
───────────────────────────────────────────────────
                 Coef.  Std. Error      z  Pr(>|z|)
───────────────────────────────────────────────────
(Intercept)  256.652       8.79834  29.17    <1e-99
days: 1        7.84395     6.95107   1.13    0.2591
days: 2        8.71009     6.95107   1.25    0.2102
days: 3       26.3402      6.95107   3.79    0.0002
days: 4       31.9976      6.95107   4.60    <1e-05
days: 5       51.8666      8.78546   5.90    <1e-08
days: 6       55.5264     11.9577    4.64    <1e-05
days: 7       62.0988      9.06303   6.85    <1e-11
days: 8       79.9777     10.9108    7.33    <1e-12
days: 9       94.1994     12.0734    7.80    <1e-14
───────────────────────────────────────────────────
fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + fulldummy(days)|subj)), sleepstudy,
    contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
 reaction ~ 1 + days + zerocorr(1 + days | subj)
   logLik   -2 logLik     AIC       AICc        BIC    
  -878.9843  1757.9686  1801.9686  1808.4145  1872.2137

Variance components:
            Column     Variance   Std.Dev.   Corr.
subj     (Intercept)  1135.145912 33.691927
         days: 0       776.498722 27.865727   .  
         days: 1       358.222782 18.926774   .     .  
         days: 2       221.696296 14.889469   .     .     .  
         days: 3         0.601244  0.775399   .     .     .     .  
         days: 4        44.953181  6.704713   .     .     .     .     .  
         days: 5       670.793401 25.899680   .     .     .     .     .     .  
         days: 6      1740.364634 41.717678   .     .     .     .     .     .     .  
         days: 7       909.218702 30.153254   .     .     .     .     .     .     .     .  
         days: 8      1458.403125 38.189045   .     .     .     .     .     .     .     .     .  
         days: 9      2028.558005 45.039516   .     .     .     .     .     .     .     .     .     .  
Residual               180.399835 13.431301
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
───────────────────────────────────────────────────
                 Coef.  Std. Error      z  Pr(>|z|)
───────────────────────────────────────────────────
(Intercept)  256.652      10.7808   23.81    <1e-99
days: 1        7.84395     9.11507   0.86    0.3895
days: 2        8.71009     8.68906   1.00    0.3161
days: 3       26.3402      7.95089   3.31    0.0009
days: 4       31.9976      8.10436   3.95    <1e-04
days: 5       51.8666     10.0225    5.18    <1e-06
days: 6       55.5265     12.644     4.39    <1e-04
days: 7       62.0988     10.6628    5.82    <1e-08
days: 8       79.9777     12.0086    6.66    <1e-10
days: 9       94.1994     13.262     7.10    <1e-11
───────────────────────────────────────────────────

Fitting generalized linear mixed models

To create a GLMM representation, the distribution family for the response, and possibly the link function, must be specified. You can either use fit(MixedModel, ...) or glmm(...) to fit the model. For instance:

verbagg = MixedModelsDatasets.dataset(:verbagg)
verbaggform = @formula(r2 ~ 1 + anger + gender + btype + situ + mode + (1|subj) + (1|item));
gm1 = fit(MixedModel, verbaggform, verbagg, Bernoulli())
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
  r2 ~ 1 + anger + gender + btype + situ + mode + (1 | subj) + (1 | item)
  Distribution: Bernoulli{Float64}
  Link: LogitLink()

   logLik    deviance     AIC       AICc        BIC    
 -4067.9164  8135.8329  8153.8329  8153.8566  8216.2370

Variance components:
        Column   Variance Std.Dev. 
subj (Intercept)  1.793397 1.339178
item (Intercept)  0.117135 0.342251

 Number of obs: 7584; levels of grouping factors: 316, 24

Fixed-effects parameters:
──────────────────────────────────────────────────────
                   Coef.  Std. Error       z  Pr(>|z|)
──────────────────────────────────────────────────────
(Intercept)   -0.153745    0.385211    -0.40    0.6898
anger          0.0574292   0.0167524    3.43    0.0006
gender: M      0.320535    0.191203     1.68    0.0937
btype: scold  -1.05969     0.184149    -5.75    <1e-08
btype: shout  -2.10385     0.186508   -11.28    <1e-28
situ: self    -1.05436     0.151187    -6.97    <1e-11
mode: want     0.707005    0.151        4.68    <1e-05
──────────────────────────────────────────────────────

The model can also be fit as

gm1 = glmm(verbaggform, verbagg, Bernoulli())
Est.SEzpσ_subjσ_item
(Intercept)-0.15370.3852-0.400.68981.33920.3423
anger0.05740.01683.430.0006
gender: M0.32050.19121.680.0937
btype: scold-1.05970.1841-5.75<1e-08
btype: shout-2.10390.1865-11.28<1e-28
situ: self-1.05440.1512-6.97<1e-11
mode: want0.70700.15104.68<1e-05

The canonical link, which is LogitLink for the Bernoulli distribution, is used if no explicit link is specified.

Note that, in keeping with convention in the GLM package, the distribution family for a binary (i.e. 0/1) response is the Bernoulli distribution. The Binomial distribution is only used when the response is the fraction of trials returning a positive, in which case the number of trials must be specified as the case weights.

Optional arguments to fit

An alternative approach is to create the GeneralizedLinearMixedModel object then call fit! on it. The optional arguments fast and/or nAGQ can be passed to the optimization process via both fit and fit! (i.e these optimization settings are not used nor recognized when constructing the model).

As the name implies, fast=true, provides a faster but somewhat less accurate fit. These fits may suffice for model comparisons.

gm1a = fit(MixedModel, verbaggform, verbagg, Bernoulli(), fast = true)
deviance(gm1a) - deviance(gm1)
0.3380123026227011
@benchmark fit(MixedModel, $verbaggform, $verbagg, Bernoulli())
BenchmarkTools.Trial: 2 samples with 1 evaluation per sample.
 Range (minmax):  4.043 s 4.055 s   GC (min … max): 0.00% … 0.09%
 Time  (median):     4.049 s              GC (median):    0.04%
 Time  (mean ± σ):   4.049 s ± 8.204 ms   GC (mean ± σ):  0.04% ± 0.06%

                              ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  4.04 s        Histogram: frequency by time        4.05 s <

 Memory estimate: 64.69 MiB, allocs estimate: 655536.
@benchmark fit(MixedModel, $verbaggform, $verbagg, Bernoulli(), fast = true)
BenchmarkTools.Trial: 33 samples with 1 evaluation per sample.
 Range (minmax):  153.781 ms160.515 ms   GC (min … max): 0.00% … 1.49%
 Time  (median):     155.001 ms                GC (median):    0.00%
 Time  (mean ± σ):   155.914 ms ±   2.048 ms   GC (mean ± σ):  0.13% ± 0.41%

     █▃█  ▃  ▃   ▃                                               
  ▇▁▇███▇▇█▇█▁▁▇█▁▇▁▇▁▁▇▇▁▁▁▁▁▁▁▁▁▁▇▇▁▁▁▁▁▁▁▁▁▁▁▇▇▇▁▁▁▇▁▇▁▁▁▇ ▁
  154 ms           Histogram: frequency by time          161 ms <

 Memory estimate: 10.73 MiB, allocs estimate: 77627.

The optional argument nAGQ=k causes evaluation of the deviance function to use a k point adaptive Gauss-Hermite quadrature rule. This method only applies to models with a single, simple, scalar random-effects term, such as

contraception = MixedModelsDatasets.dataset(:contra)
contraform = @formula(use ~ 1 + age + abs2(age) + livch + urban + (1|dist));
bernoulli = Bernoulli()
deviances = Dict{Symbol,Float64}()
b = @benchmarkable deviances[:default] = deviance(fit(MixedModel, $contraform, $contraception, $bernoulli));
run(b)
b = @benchmarkable deviances[:fast] = deviance(fit(MixedModel, $contraform, $contraception, $bernoulli, fast = true));
run(b)
b = @benchmarkable deviances[:nAGQ] = deviance(fit(MixedModel, $contraform, $contraception, $bernoulli, nAGQ=9));
run(b)
b = @benchmarkable deviances[:nAGQ_fast] = deviance(fit(MixedModel, $contraform, $contraception, $bernoulli, nAGQ=9, fast=true));
run(b)
sort(deviances)
OrderedCollections.OrderedDict{Symbol, Float64} with 4 entries:
  :default   => 2372.73
  :fast      => 2372.78
  :nAGQ      => 2372.46
  :nAGQ_fast => 2372.51

Extractor functions

LinearMixedModel and GeneralizedLinearMixedModel are subtypes of StatsAPI.RegressionModel which, in turn, is a subtype of StatsBase.StatisticalModel. Many of the generic extractors defined in the StatsBase package have methods for these models.

Model-fit statistics

The statistics describing the quality of the model fit include

StatsAPI.loglikelihoodFunction
loglikelihood(model::StatisticalModel)
loglikelihood(model::StatisticalModel, observation)

Return the log-likelihood of the model.

With an observation argument, return the contribution of observation to the log-likelihood of model.

If observation is a Colon, return a vector of each observation's contribution to the log-likelihood of the model. In other words, this is the vector of the pointwise log-likelihood contributions.

In general, sum(loglikehood(model, :)) == loglikelihood(model).

source
StatsAPI.aicFunction
aic(model::StatisticalModel)

Akaike's Information Criterion, defined as $-2 \log L + 2k$, with $L$ the likelihood of the model, and k its number of consumed degrees of freedom (as returned by dof).

source
StatsAPI.bicFunction
bic(model::StatisticalModel)

Bayesian Information Criterion, defined as $-2 \log L + k \log n$, with $L$ the likelihood of the model, $k$ its number of consumed degrees of freedom (as returned by dof), and $n$ the number of observations (as returned by nobs).

source
StatsAPI.dofFunction
dof(model::StatisticalModel)

Return the number of degrees of freedom consumed in the model, including when applicable the intercept and the distribution's dispersion parameter.

source
StatsAPI.nobsFunction
nobs(model::StatisticalModel)

Return the number of independent observations on which the model was fitted. Be careful when using this information, as the definition of an independent observation may vary depending on the model, on the format used to pass the data, on the sampling plan (if specified), etc.

source
loglikelihood(fm1)
-163.66352994056336
aic(fm1)
333.32705988112673
bic(fm1)
337.5306520261132
dof(fm1)   # 1 fixed effect, 2 variances
3
nobs(fm1)  # 30 observations
30
loglikelihood(gm1)
-4067.916429701673

In general the deviance of a statistical model fit is negative twice the log-likelihood adjusting for the saturated model.

StatsAPI.devianceMethod
deviance(m::GeneralizedLinearMixedModel{T}, nAGQ=1)::T where {T}

Return the deviance of m evaluated by the Laplace approximation (nAGQ=1) or nAGQ-point adaptive Gauss-Hermite quadrature.

If the distribution D does not have a scale parameter the Laplace approximation is the squared length of the conditional modes, $u$, plus the determinant of $Λ'Z'WZΛ + I$, plus the sum of the squared deviance residuals.

source

Because it is not clear what the saturated model corresponding to a particular LinearMixedModel should be, negative twice the log-likelihood is called the objective.

This value is also accessible as the deviance but the user should bear in mind that this doesn't have all the properties of a deviance which is corrected for the saturated model. For example, it is not necessarily non-negative.

objective(fm1)
327.32705988112673
deviance(fm1)
327.32705988112673

The value optimized when fitting a GeneralizedLinearMixedModel is the Laplace approximation to the deviance or an adaptive Gauss-Hermite evaluation.

MixedModels.deviance!Function
deviance!(m::GeneralizedLinearMixedModel, nAGQ=1)

Update m.η, m.μ, etc., install the working response and working weights in m.LMM, update m.LMM.A and m.LMM.R, then evaluate the deviance.

source
MixedModels.deviance!(gm1)
8135.832859403364

Fixed-effects parameter estimates

The coef and fixef extractors both return the maximum likelihood estimates of the fixed-effects coefficients. They differ in their behavior in the rank-deficient case. The associated coefnames and fixefnames return the corresponding coefficient names.

StatsAPI.coefFunction
coef(model::StatisticalModel)

Return the coefficients of the model.

source
MixedModels.fixefFunction
fixef(m::MixedModel)

Return the fixed-effects parameter vector estimate of m.

In the rank-deficient case the truncated parameter vector, of length rank(m) is returned. This is unlike coef which always returns a vector whose length matches the number of columns in X.

source
MixedModels.fixefnamesFunction
fixefnames(m::MixedModel)

Return a (permuted and truncated in the rank-deficient case) vector of coefficient names.

source
coef(fm1)
coefnames(fm1)
1-element Vector{String}:
 "(Intercept)"
fixef(fm1)
fixefnames(fm1)
1-element Vector{String}:
 "(Intercept)"

An alternative extractor for the fixed-effects coefficient is the β property. Properties whose names are Greek letters usually have an alternative spelling, which is the name of the Greek letter.

fm1.β
1-element Vector{Float64}:
 1527.5000000000011
fm1.beta
1-element Vector{Float64}:
 1527.5000000000011
gm1.β
7-element Vector{Float64}:
 -0.15374521706254418
  0.057429184648104196
  0.32053477387095086
 -1.059693575252348
 -2.103852401715679
 -1.0543587111038337
  0.7070045757873128

A full list of property names is returned by propertynames

propertynames(fm1)
(:formula, :reterms, :Xymat, :feterm, :sqrtwts, :parmap, :dims, :A, :L, :optsum, :θ, :theta, :β, :beta, :βs, :betas, :λ, :lambda, :stderror, :σ, :sigma, :σs, :sigmas, :σρs, :sigmarhos, :b, :u, :X, :y, :corr, :vcov, :PCA, :rePCA, :objective, :pvalues)
propertynames(gm1)
(:A, :L, :theta, :beta, :coef, :λ, :σ, :sigma, :X, :y, :objective, :σρs, :σs, :corr, :vcov, :PCA, :rePCA, :LMM, :β, :θ, :b, :u, :resp, :wt)

The variance-covariance matrix of the fixed-effects coefficients is returned by

StatsAPI.vcovFunction
vcov(model::StatisticalModel)

Return the variance-covariance matrix for the coefficients of the model.

source
vcov(fm2)
2×2 Matrix{Float64}:
 43.9874   -1.37052
 -1.37052   2.25673
vcov(gm1)
7×7 Matrix{Float64}:
  0.148388    -0.00561464   -0.00982298   …  -0.0112061    -0.0113458
 -0.00561464   0.000280643   7.19086e-5      -1.47973e-5    1.02416e-5
 -0.00982298   7.19086e-5    0.0365586       -8.04336e-5    5.25819e-5
 -0.0167971   -1.43715e-5   -9.25519e-5       0.000265793  -0.000172092
 -0.0166211   -2.90571e-5   -0.000162372      0.00065896   -0.00052052
 -0.0112061   -1.47973e-5   -8.04336e-5   …   0.0228574    -0.00024778
 -0.0113458    1.02416e-5    5.25819e-5      -0.00024778    0.0228009

The standard errors are the square roots of the diagonal elements of the estimated variance-covariance matrix of the fixed-effects coefficient estimators.

StatsAPI.stderrorFunction
stderror(model::StatisticalModel)

Return the standard errors for the coefficients of the model.

source
stderror(fm2)
2-element Vector{Float64}:
 6.632297942343554
 1.502240680664057
stderror(gm1)
7-element Vector{Float64}:
 0.38521128748945005
 0.01675241548930026
 0.19120290837085227
 0.18414923963837124
 0.18650807321698457
 0.15118668987587122
 0.15099961612858606

Finally, the coeftable generic produces a table of coefficient estimates, their standard errors, and their ratio. The p-values quoted here should be regarded as approximations.

StatsAPI.coeftableFunction
coeftable(model::StatisticalModel; level::Real=0.95)

Return a table with coefficients and related statistics of the model. level determines the level for confidence intervals (by default, 95%).

The returned CoefTable object implements the Tables.jl interface, and can be converted e.g. to a DataFrame via using DataFrames; DataFrame(coeftable(model)).

source
coeftable(fm2)
──────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)
──────────────────────────────────────────────────
(Intercept)  251.405      6.6323   37.91    <1e-99
days          10.4673     1.50224   6.97    <1e-11
──────────────────────────────────────────────────

Covariance parameter estimates

The covariance parameters estimates, in the form shown in the model summary, are a VarCorr object

VarCorr(fm2)
Variance components:
            Column    Variance Std.Dev.   Corr.
subj     (Intercept)  565.52071 23.78068
         days          32.68242  5.71685 +0.08
Residual              654.94015 25.59180
VarCorr(gm1)
Variance components:
        Column   Variance Std.Dev. 
subj (Intercept)  1.793397 1.339178
item (Intercept)  0.117135 0.342251

Individual components are returned by other extractors

MixedModels.varestFunction
varest(m::LinearMixedModel)

Returns the estimate of σ², the variance of the conditional distribution of Y given B.

source
varest(m::GeneralizedLinearMixedModel)

Returns the estimate of ϕ², the variance of the conditional distribution of Y given B.

For models with a dispersion parameter ϕ, this is simply ϕ². For models without a dispersion parameter, this value is missing. This differs from disperion, which returns 1 for models without a dispersion parameter.

For Gaussian models, this parameter is often called σ².

source
MixedModels.sdestFunction
sdest(m::LinearMixedModel)

Return the estimate of σ, the standard deviation of the per-observation noise.

source
sdest(m::GeneralizedLinearMixedModel)

Return the estimate of the dispersion, i.e. the standard deviation of the per-observation noise.

For models with a dispersion parameter ϕ, this is simply ϕ. For models without a dispersion parameter, this value is missing. This differs from disperion, which returns 1 for models without a dispersion parameter.

For Gaussian models, this parameter is often called σ.

source
varest(fm2)
654.9401543542393
sdest(fm2)
25.59179857599382
fm2.σ
25.59179857599382

Conditional modes of the random effects

The ranef extractor

MixedModels.ranefFunction
ranef(m::LinearMixedModel; uscale=false)

Return, as a Vector{Matrix{T}}, the conditional modes of the random effects in model m.

If uscale is true the random effects are on the spherical (i.e. u) scale, otherwise on the original scale.

For a named variant, see raneftables.

source
ranef(fm1)
1-element Vector{Matrix{Float64}}:
 [-16.628221179075137 0.36951602620083934 … 53.57982379923807 -42.494343013190274]
fm1.b
1-element Vector{Matrix{Float64}}:
 [-16.628221179075137 0.36951602620083934 … 53.57982379923807 -42.494343013190274]

returns the conditional modes of the random effects given the observed data. That is, these are the values that maximize the conditional density of the random effects given the observed data. For a LinearMixedModel these are also the conditional means.

These are sometimes called the best linear unbiased predictors or BLUPs but that name is not particularly meaningful.

At a superficial level these can be considered as the "estimates" of the random effects, with a bit of hand waving, but pursuing this analogy too far usually results in confusion.

To obtain tables associating the values of the conditional modes with the levels of the grouping factor, use

MixedModels.raneftablesFunction
raneftables(m::MixedModel; uscale = false)

Return the conditional means of the random effects as a NamedTuple of Tables.jl-compliant tables.

Note

The API guarantee is only that the NamedTuple contains Tables.jl tables and not on the particular concrete type of each table.

source

as in

DataFrame(only(raneftables(fm1)))
6×2 DataFrame
Rowbatch(Intercept)
StringFloat64
1A-16.6282
2B0.369516
3C26.9747
4D-21.8014
5E53.5798
6F-42.4943

The corresponding conditional variances are returned by

MixedModels.condVarFunction
condVar(m::LinearMixedModel)

Return the conditional variances matrices of the random effects.

The random effects are returned by ranef as a vector of length k, where k is the number of random effects terms. The ith element is a matrix of size vᵢ × ℓᵢ where vᵢ is the size of the vector-valued random effects for each of the ℓᵢ levels of the grouping factor. Technically those values are the modes of the conditional distribution of the random effects given the observed data.

This function returns an array of k three dimensional arrays, where the ith array is of size vᵢ × vᵢ × ℓᵢ. These are the diagonal blocks from the conditional variance-covariance matrix,

s² Λ(Λ'Z'ZΛ + I)⁻¹Λ'
source
condVar(fm1)
1-element Vector{Array{Float64, 3}}:
 [362.3104691431006;;; 362.3104691431006;;; 362.3104691431006;;; 362.3104691431006;;; 362.3104691431006;;; 362.3104691431006]

Case-wise diagnostics and residual degrees of freedom

The leverage values

StatsAPI.leverageFunction
leverage(model::RegressionModel)

Return the diagonal of the projection matrix of the model.

source
leverage(fm1)
30-element Vector{Float64}:
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 ⋮
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158
 0.1565053420672158

are used in diagnostics for linear regression models to determine cases that exert a strong influence on their own predicted response.

The documentation refers to a "projection". For a linear model without random effects the fitted values are obtained by orthogonal projection of the response onto the column span of the model matrix and the sum of the leverage values is the dimension of this column span. That is, the sum of the leverage values is the rank of the model matrix and n - sum(leverage(m)) is the degrees of freedom for residuals. The sum of the leverage values is also the trace of the so-called "hat" matrix, H. (The name "hat matrix" reflects the fact that $\hat{\mathbf{y}} = \mathbf{H} \mathbf{y}$. That is, H puts a hat on y.)

For a linear mixed model the sum of the leverage values will be between p, the rank of the fixed-effects model matrix, and p + q where q is the total number of random effects. This number does not represent a dimension (or "degrees of freedom") of a linear subspace of all possible fitted values because the projection is not an orthogonal projection. Nevertheless, it is a reasonable measure of the effective degrees of freedom of the model and n - sum(leverage(m)) can be considered the effective residual degrees of freedom.

For model fm1 the dimensions are

n, p, q, k = size(fm1)
(30, 1, 6, 1)

which implies that the sum of the leverage values should be in the range [1, 7]. The actual value is

sum(leverage(fm1))
4.695160262016475

For model fm2 the dimensions are

n, p, q, k = size(fm2)
(180, 2, 36, 1)

providing a range of [2, 38] for the effective degrees of freedom for the model. The observed value is

sum(leverage(fm2))
28.61167027736884

When a model converges to a singular covariance, such as

fm3 = fit(MixedModel, @formula(yield ~ 1+(1|batch)), MixedModelsDatasets.dataset(:dyestuff2))
Linear mixed model fit by maximum likelihood
 yield ~ 1 + (1 | batch)
   logLik   -2 logLik     AIC       AICc        BIC    
   -81.4365   162.8730   168.8730   169.7961   173.0766

Variance components:
            Column   Variance Std.Dev.
batch    (Intercept)   0.00000 0.00000
Residual              13.34610 3.65323
 Number of obs: 30; levels of grouping factors: 6

  Fixed-effects parameters:
───────────────────────────────────────────────
              Coef.  Std. Error     z  Pr(>|z|)
───────────────────────────────────────────────
(Intercept)  5.6656    0.666986  8.49    <1e-16
───────────────────────────────────────────────

the effective degrees of freedom is the lower bound.

sum(leverage(fm3))
0.9999999999999998

Models for which the estimates of the variances of the random effects are large relative to the residual variance have effective degrees of freedom close to the upper bound.

fm4 = fit(MixedModel, @formula(diameter ~ 1+(1|plate)+(1|sample)),
    MixedModelsDatasets.dataset(:penicillin))
Linear mixed model fit by maximum likelihood
 diameter ~ 1 + (1 | plate) + (1 | sample)
   logLik   -2 logLik     AIC       AICc        BIC    
  -166.0942   332.1883   340.1883   340.4761   352.0676

Variance components:
            Column   Variance Std.Dev. 
plate    (Intercept)  0.714993 0.845572
sample   (Intercept)  3.135139 1.770632
Residual              0.302426 0.549932
 Number of obs: 144; levels of grouping factors: 24, 6

  Fixed-effects parameters:
─────────────────────────────────────────────────
               Coef.  Std. Error      z  Pr(>|z|)
─────────────────────────────────────────────────
(Intercept)  22.9722     0.74459  30.85    <1e-99
─────────────────────────────────────────────────
sum(leverage(fm4))
27.465347288682793

Also, a model fit by the REML criterion generally has larger estimates of the variance components and hence a larger effective degrees of freedom.

fm4r = fit(MixedModel, @formula(diameter ~ 1+(1|plate)+(1|sample)),
    MixedModelsDatasets.dataset(:penicillin), REML=true)
Linear mixed model fit by REML
 diameter ~ 1 + (1 | plate) + (1 | sample)
 REML criterion at convergence: 330.8605889910177

Variance components:
            Column   Variance Std.Dev. 
plate    (Intercept)  0.716908 0.846704
sample   (Intercept)  3.730916 1.931558
Residual              0.302415 0.549923
 Number of obs: 144; levels of grouping factors: 24, 6

  Fixed-effects parameters:
─────────────────────────────────────────────────
               Coef.  Std. Error      z  Pr(>|z|)
─────────────────────────────────────────────────
(Intercept)  22.9722    0.808573  28.41    <1e-99
─────────────────────────────────────────────────
sum(leverage(fm4r))
27.472361993617362