Model constructors
The LinearMixedModel
type represents a linear mixed-effects model. Typically it is constructed from a Formula
and an appropriate Table
type, usually a DataFrame
.
Examples of linear mixed-effects model fits
For illustration, several data sets from the lme4 package for R are made available in .arrow
format in this package. Often, for convenience, we will convert these to DataFrame
s. These data sets include the dyestuff
and dyestuff2
data sets.
using DataFrames, MixedModels, StatsModels
dyestuff = MixedModels.dataset(:dyestuff)
Arrow.Table with 30 rows, 2 columns, and schema:
:batch String
:yield Int16
describe(DataFrame(dyestuff))
2 rows × 7 columns
variable | mean | min | median | max | nmissing | eltype | |
---|---|---|---|---|---|---|---|
Symbol | Union… | Any | Union… | Any | Int64 | DataType | |
1 | batch | A | F | 0 | String | ||
2 | yield | 1527.5 | 1440 | 1530.0 | 1635 | 0 | Int16 |
The @formula
language in Julia
MixedModels.jl builds on the the Julia formula language provided by StatsModels.jl, which is similar to the formula language in R and is also based on the notation from Wilkinson and Rogers (1973). There are two ways to construct a formula in Julia. The first way is to enclose the formula expression in the @formula
macro:
StatsModels.@formula
— Macro@formula(ex)
Capture and parse a formula expression as a Formula
struct.
A formula is an abstract specification of a dependence between left-hand and right-hand side variables as in, e.g., a regression model. Each side specifies at a high level how tabular data is to be converted to a numerical matrix suitable for modeling. This specification looks something like Julia code, is represented as a Julia Expr
, but uses special syntax. The @formula
macro takes an expression like y ~ 1 + a*b
, transforms it according to the formula syntax rules into a lowered form (like y ~ 1 + a + b + a&b
), and constructs a Formula
struct which captures the original expression, the lowered expression, and the left- and right-hand-side.
Operators that have special interpretations in this syntax are
~
is the formula separator, where it is a binary operator (the first argument is the left-hand side, and the second is the right-hand side.+
concatenates variables as columns when generating a model matrix.&
representes an interaction between two or more variables, which corresponds to a row-wise kronecker product of the individual terms (or element-wise product if all terms involved are continuous/scalar).*
expands to all main effects and interactions:a*b
is equivalent toa+b+a&b
,a*b*c
toa+b+c+a&b+a&c+b&c+a&b&c
, etc.1
,0
, and-1
indicate the presence (for1
) or absence (for0
and-1
) of an intercept column.
The rules that are applied are
- The associative rule (un-nests nested calls to
+
,&
, and*
). - The distributive rule (interactions
&
distribute over concatenation+
). - The
*
rule expandsa*b
toa+b+a&b
(recursively). - Subtraction is converted to addition and negation, so
x-1
becomesx + -1
(applies only to subtraction of literal 1). - Single-argument
&
calls are stripped, so&(x)
becomes the main effectx
.
The second way is to combine Term
s with operators like +
, &
, ~
, and others at "run time". This is especially useful if you wish to create a formula from a list a variable names. For instance, the following are equivalent:
@formula(y ~ 1 + a + b + a & b) == (term(:y) ~ term(1) + term(:a) + term(:b) + term(:a) & term(:b))
true
MixedModels.jl provides additional formula syntax for representing random-effects terms. Most importantly, |
separates random effects and their grouping factors (as in the formula extension used by the R package lme4
. Much like with the base formula language, |
can be used within the @formula
macro and to construct a formula programmatically:
@formula(y ~ 1 + a + b + (1 + a + b | g))
FormulaTerm
Response:
y(unknown)
Predictors:
1
a(unknown)
b(unknown)
(a,b,g)->(1 + a + b) | g
terms = sum(term(t) for t in [1, :a, :b])
group = term(:g)
response = term(:y)
response ~ terms + (terms | group)
FormulaTerm
Response:
y(unknown)
Predictors:
1
a(unknown)
b(unknown)
(1 + a + b | g)
Models with simple, scalar random effects
A basic model with simple, scalar random effects for the levels of batch
(the batch of an intermediate product, in this case) is declared and fit as
fm = @formula(yield ~ 1 + (1|batch))
fm1 = fit(MixedModel, fm, dyestuff)
Linear mixed model fit by maximum likelihood
yield ~ 1 + (1 | batch)
logLik -2 logLik AIC AICc BIC
-163.6635 327.3271 333.3271 334.2501 337.5307
Variance components:
Column Variance Std.Dev.
batch (Intercept) 1388.3334 37.2603
Residual 2451.2500 49.5101
Number of obs: 30; levels of grouping factors: 6
Fixed-effects parameters:
────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
────────────────────────────────────────────────
(Intercept) 1527.5 17.6946 86.33 <1e-99
────────────────────────────────────────────────
(If you are new to Julia you may find that this first fit takes an unexpectedly long time, due to Just-In-Time (JIT) compilation of the code. The subsequent calls to such functions are much faster.)
using BenchmarkTools
dyestuff2 = MixedModels.dataset(:dyestuff2)
@benchmark fit(MixedModel, $fm, $dyestuff2)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 134.701 μs … 39.253 ms ┊ GC (min … max): 0.00% … 95.34%
Time (median): 142.400 μs ┊ GC (median): 0.00%
Time (mean ± σ): 159.537 μs ± 742.646 μs ┊ GC (mean ± σ): 8.96% ± 1.92%
▂▃▄▆▆█▄▂ ▁
▁▁▁▂▂▃▄▆██████████▆▅▅▅▄▄▄▄▅▄▃▃▃▃▃▂▃▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
135 μs Histogram: frequency by time 167 μs <
Memory estimate: 54.19 KiB, allocs estimate: 981.
By default, the model is fit by maximum likelihood. To use the REML
criterion instead, add the optional named argument REML=true
to the call to fit
fm1reml = fit(MixedModel, fm, dyestuff, REML=true)
Linear mixed model fit by REML
yield ~ 1 + (1 | batch)
REML criterion at convergence: 319.6542768422544
Variance components:
Column Variance Std.Dev.
batch (Intercept) 1764.0517 42.0006
Residual 2451.2496 49.5101
Number of obs: 30; levels of grouping factors: 6
Fixed-effects parameters:
────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
────────────────────────────────────────────────
(Intercept) 1527.5 19.3834 78.80 <1e-99
────────────────────────────────────────────────
Floating-point type in the model
The type of fm1
typeof(fm1)
LinearMixedModel{Float64}
includes the floating point type used internally for the various matrices, vectors, and scalars that represent the model. At present, this will always be Float64
because the parameter estimates are optimized using the NLopt
package which calls compiled C code that only allows for optimization with respect to a Float64
parameter vector.
So in theory other floating point types, such as BigFloat
or Float32
, can be used to define a model but in practice only Float64
works at present.
In theory, theory and practice are the same. In practice, they aren't. – Anon
Simple, scalar random effects
A simple, scalar random effects term in a mixed-effects model formula is of the form (1|G)
. All random effects terms end with |G
where G
is the grouping factor for the random effect. The name or, more generally the expression, G
, should evaluate to a categorical array that has a distinct set of levels. The random effects are associated with the levels of the grouping factor.
A scalar random effect is, as the name implies, one scalar value for each level of the grouping factor. A simple, scalar random effects term is of the form, (1|G)
. It corresponds to a shift in the intercept for each level of the grouping factor.
Models with vector-valued random effects
The sleepstudy data are observations of reaction time, reaction
, on several subjects, subj
, after 0 to 9 days of sleep deprivation, days
. A model with random intercepts and random slopes for each subject, allowing for within-subject correlation of the slope and intercept, is fit as
sleepstudy = MixedModels.dataset(:sleepstudy)
fm2 = fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days|subj)), sleepstudy)
Linear mixed model fit by maximum likelihood
reaction ~ 1 + days + (1 + days | subj)
logLik -2 logLik AIC AICc BIC
-875.9697 1751.9393 1763.9393 1764.4249 1783.0971
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 565.51068 23.78047
days 32.68212 5.71683 +0.08
Residual 654.94145 25.59182
Number of obs: 180; levels of grouping factors: 18
Fixed-effects parameters:
──────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
──────────────────────────────────────────────────
(Intercept) 251.405 6.63226 37.91 <1e-99
days 10.4673 1.50224 6.97 <1e-11
──────────────────────────────────────────────────
Models with multiple, scalar random-effects terms
A model for the Penicillin data incorporates random effects for the plate, and for the sample. As every sample is used on every plate these two factors are crossed.
penicillin = MixedModels.dataset(:penicillin)
fm3 = fit(MixedModel, @formula(diameter ~ 1 + (1|plate) + (1|sample)), penicillin)
Linear mixed model fit by maximum likelihood
diameter ~ 1 + (1 | plate) + (1 | sample)
logLik -2 logLik AIC AICc BIC
-166.0942 332.1883 340.1883 340.4761 352.0676
Variance components:
Column Variance Std.Dev.
plate (Intercept) 0.714979 0.845565
sample (Intercept) 3.135193 1.770648
Residual 0.302426 0.549933
Number of obs: 144; levels of grouping factors: 24, 6
Fixed-effects parameters:
─────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
─────────────────────────────────────────────────
(Intercept) 22.9722 0.744596 30.85 <1e-99
─────────────────────────────────────────────────
In contrast, the cask
grouping factor is nested within the batch
grouping factor in the Pastes data.
pastes = DataFrame(MixedModels.dataset(:pastes))
describe(pastes)
3 rows × 7 columns
variable | mean | min | median | max | nmissing | eltype | |
---|---|---|---|---|---|---|---|
Symbol | Union… | Any | Union… | Any | Int64 | DataType | |
1 | batch | A | J | 0 | String | ||
2 | cask | a | c | 0 | String | ||
3 | strength | 60.0533 | 54.2 | 59.3 | 66.0 | 0 | Float64 |
This can be expressed using the solidus (the "/
" character) to separate grouping factors, read "cask
nested within batch
":
fm4a = fit(MixedModel, @formula(strength ~ 1 + (1|batch/cask)), pastes)
Linear mixed model fit by maximum likelihood
strength ~ 1 + (1 | batch) + (1 | batch & cask)
logLik -2 logLik AIC AICc BIC
-123.9972 247.9945 255.9945 256.7217 264.3718
Variance components:
Column Variance Std.Dev.
batch & cask (Intercept) 8.433617 2.904069
batch (Intercept) 1.199178 1.095070
Residual 0.678002 0.823409
Number of obs: 60; levels of grouping factors: 30, 10
Fixed-effects parameters:
─────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
─────────────────────────────────────────────────
(Intercept) 60.0533 0.642136 93.52 <1e-99
─────────────────────────────────────────────────
If the levels of the inner grouping factor are unique across the levels of the outer grouping factor, then this nesting does not need to expressed explicitly in the model syntax. For example, defining sample
to be the combination of batch
and cask
, yields a naming scheme where the nesting is apparent from the data even if not expressed in the formula. (That is, each level of sample
occurs in conjunction with only one level of batch
.) As such, this model is equivalent to the previous one.
pastes.sample = (string.(pastes.cask, "&", pastes.batch))
fm4b = fit(MixedModel, @formula(strength ~ 1 + (1|sample) + (1|batch)), pastes)
Linear mixed model fit by maximum likelihood
strength ~ 1 + (1 | sample) + (1 | batch)
logLik -2 logLik AIC AICc BIC
-123.9972 247.9945 255.9945 256.7217 264.3718
Variance components:
Column Variance Std.Dev.
sample (Intercept) 8.433617 2.904069
batch (Intercept) 1.199179 1.095070
Residual 0.678002 0.823409
Number of obs: 60; levels of grouping factors: 30, 10
Fixed-effects parameters:
─────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
─────────────────────────────────────────────────
(Intercept) 60.0533 0.642136 93.52 <1e-99
─────────────────────────────────────────────────
In observational studies it is common to encounter partially crossed grouping factors. For example, the InstEval data are course evaluations by students, s
, of instructors, d
. Additional covariates include the academic department, dept
, in which the course was given and service
, whether or not it was a service course.
insteval = MixedModels.dataset(:insteval)
fm5 = fit(MixedModel, @formula(y ~ 1 + service * dept + (1|s) + (1|d)), insteval)
Linear mixed model fit by maximum likelihood
y ~ 1 + service + dept + service & dept + (1 | s) + (1 | d)
logLik -2 logLik AIC AICc BIC
-118792.7767 237585.5534 237647.5534 237647.5804 237932.8763
Variance components:
Column Variance Std.Dev.
s (Intercept) 0.105418 0.324681
d (Intercept) 0.258416 0.508347
Residual 1.384728 1.176745
Number of obs: 73421; levels of grouping factors: 2972, 1128
Fixed-effects parameters:
────────────────────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
────────────────────────────────────────────────────────────────
(Intercept) 3.27628 0.0793647 41.28 <1e-99
service: Y 0.0116044 0.0699321 0.17 0.8682
dept: D02 -0.0411091 0.120331 -0.34 0.7326
dept: D03 0.00967415 0.108411 0.09 0.9289
dept: D04 0.105017 0.0944964 1.11 0.2664
dept: D05 0.0828643 0.11148 0.74 0.4573
dept: D06 -0.01194 0.0978342 -0.12 0.9029
dept: D07 0.0992679 0.110598 0.90 0.3694
dept: D08 0.0575337 0.127935 0.45 0.6529
dept: D09 -0.00263179 0.107085 -0.02 0.9804
dept: D10 -0.223423 0.099838 -2.24 0.0252
dept: D11 0.0129817 0.110639 0.12 0.9066
dept: D12 0.00503827 0.0944243 0.05 0.9574
dept: D14 0.00508271 0.109041 0.05 0.9628
dept: D15 -0.0466719 0.101942 -0.46 0.6471
service: Y & dept: D02 -0.144352 0.0929527 -1.55 0.1204
service: Y & dept: D03 0.0174078 0.0927237 0.19 0.8511
service: Y & dept: D04 -0.0381263 0.0810901 -0.47 0.6382
service: Y & dept: D05 0.0596632 0.123952 0.48 0.6303
service: Y & dept: D06 -0.254044 0.080781 -3.14 0.0017
service: Y & dept: D07 -0.151634 0.11157 -1.36 0.1741
service: Y & dept: D08 0.0508942 0.112189 0.45 0.6501
service: Y & dept: D09 -0.259448 0.0899448 -2.88 0.0039
service: Y & dept: D10 0.25907 0.111369 2.33 0.0200
service: Y & dept: D11 -0.276577 0.0819621 -3.37 0.0007
service: Y & dept: D12 -0.041849 0.0792928 -0.53 0.5977
service: Y & dept: D14 -0.256742 0.0931016 -2.76 0.0058
service: Y & dept: D15 0.24042 0.0982071 2.45 0.0144
────────────────────────────────────────────────────────────────
Simplifying the random effect correlation structure
MixedModels.jl estimates not only the variance of the effects for each random effect level, but also the correlation between the random effects for different predictors. So, for the model of the sleepstudy data above, one of the parameters that is estimated is the correlation between each subject's random intercept (i.e., their baseline reaction time) and slope (i.e., their particular change in reaction time per day of sleep deprivation). In some cases, you may wish to simplify the random effects structure by removing these correlation parameters. This often arises when there are many random effects you want to estimate (as is common in psychological experiments with many conditions and covariates), since the number of random effects parameters increases as the square of the number of predictors, making these models difficult to estimate from limited data.
The special syntax zerocorr
can be applied to individual random effects terms inside the @formula
:
fm2zerocorr_fm = fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + days|subj)), sleepstudy)
Linear mixed model fit by maximum likelihood
reaction ~ 1 + days + MixedModels.ZeroCorr((1 + days | subj))
logLik -2 logLik AIC AICc BIC
-876.0016 1752.0033 1762.0033 1762.3481 1777.9680
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 584.25897 24.17145
days 33.63281 5.79938 .
Residual 653.11578 25.55613
Number of obs: 180; levels of grouping factors: 18
Fixed-effects parameters:
──────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
──────────────────────────────────────────────────
(Intercept) 251.405 6.70771 37.48 <1e-99
days 10.4673 1.51931 6.89 <1e-11
──────────────────────────────────────────────────
Alternatively, correlations between parameters can be removed by including them as separate random effects terms:
fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj) + (days|subj)), sleepstudy)
Linear mixed model fit by maximum likelihood
reaction ~ 1 + days + (1 | subj) + (days | subj)
logLik -2 logLik AIC AICc BIC
-876.0016 1752.0033 1762.0033 1762.3481 1777.9680
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 584.25897 24.17145
days 33.63281 5.79938 .
Residual 653.11578 25.55613
Number of obs: 180; levels of grouping factors: 18
Fixed-effects parameters:
──────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
──────────────────────────────────────────────────
(Intercept) 251.405 6.70771 37.48 <1e-99
days 10.4673 1.51931 6.89 <1e-11
──────────────────────────────────────────────────
Finally, for predictors that are categorical, MixedModels.jl will estimate correlations between each level. Notice the large number of correlation parameters if we treat days
as a categorical variable by giving it contrasts:
fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
reaction ~ 1 + days + (1 + days | subj)
logLik -2 logLik AIC AICc BIC
-805.9946 1611.9892 1743.9892 1822.2547 1954.7244
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 953.78376 30.88339
days: 1 493.86950 22.22317 -0.30
days: 2 912.25613 30.20358 -0.57 +0.75
days: 3 1265.62293 35.57559 -0.37 +0.72 +0.87
days: 4 1481.70171 38.49288 -0.32 +0.59 +0.67 +0.91
days: 5 2293.20330 47.88740 -0.25 +0.46 +0.46 +0.70 +0.85
days: 6 3850.57443 62.05300 -0.28 +0.30 +0.48 +0.70 +0.77 +0.75
days: 7 1806.32775 42.50091 -0.16 +0.22 +0.47 +0.50 +0.63 +0.64 +0.71
days: 8 3150.92375 56.13309 -0.20 +0.28 +0.37 +0.56 +0.73 +0.90 +0.73 +0.74
days: 9 3072.05197 55.42609 +0.05 +0.26 +0.16 +0.38 +0.58 +0.78 +0.38 +0.53 +0.85
Residual 20.91428 4.57321
Number of obs: 180; levels of grouping factors: 18
Fixed-effects parameters:
───────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
───────────────────────────────────────────────────
(Intercept) 256.652 7.35866 34.88 <1e-99
days: 1 7.84395 5.45536 1.44 0.1505
days: 2 8.71009 7.28043 1.20 0.2316
days: 3 26.3402 8.52269 3.09 0.0020
days: 4 31.9976 9.20003 3.48 0.0005
days: 5 51.8666 11.3896 4.55 <1e-05
days: 6 55.5264 14.7053 3.78 0.0002
days: 7 62.0988 10.1329 6.13 <1e-09
days: 8 79.9777 13.3182 6.01 <1e-08
days: 9 94.1994 13.1527 7.16 <1e-12
───────────────────────────────────────────────────
Separating the 1
and days
random effects into separate terms removes the correlations between the intercept and the levels of days
, but not between the levels themselves:
fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj) + (days|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
reaction ~ 1 + days + (1 | subj) + (days | subj)
logLik -2 logLik AIC AICc BIC
-827.7753 1655.5505 1769.5505 1823.7472 1951.5490
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 790.5131 28.1161
days: 1 0.0000 0.0000 .
days: 2 357.4092 18.9053 . NaN
days: 3 683.6449 26.1466 . NaN +0.81
days: 4 948.2300 30.7933 . NaN +0.57 +0.91
days: 5 1751.2438 41.8479 . NaN +0.26 +0.66 +0.87
days: 6 3353.2201 57.9070 . NaN +0.44 +0.72 +0.80 +0.76
days: 7 1538.5333 39.2241 . NaN +0.39 +0.42 +0.59 +0.62 +0.71
days: 8 2736.0653 52.3074 . NaN +0.21 +0.51 +0.75 +0.93 +0.72 +0.75
days: 9 2768.9574 52.6209 . NaN -0.05 +0.28 +0.57 +0.80 +0.34 +0.52 +0.87
Residual 134.9843 11.6183
Number of obs: 180; levels of grouping factors: 18
Fixed-effects parameters:
───────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
───────────────────────────────────────────────────
(Intercept) 256.652 7.17053 35.79 <1e-99
days: 1 7.84395 3.87276 2.03 0.0428
days: 2 8.71009 5.90375 1.48 0.1401
days: 3 26.3402 7.27863 3.62 0.0003
days: 4 31.9976 8.22665 3.89 0.0001
days: 5 51.8666 10.5967 4.89 <1e-06
days: 6 55.5264 14.1876 3.91 <1e-04
days: 7 62.0988 10.0236 6.20 <1e-09
days: 8 79.9777 12.9229 6.19 <1e-09
days: 9 94.1994 12.9934 7.25 <1e-12
───────────────────────────────────────────────────
(Notice that the variance component for days: 1
is estimated as zero, so the correlations for this component are undefined and expressed as NaN
, not a number.)
An alternative is to force all the levels of days
as indicators using fulldummy
encoding.
MixedModels.fulldummy
— Functionfulldummy(term::CategoricalTerm)
Assign "contrasts" that include all indicator columns (dummy variables) and an intercept column.
This will result in an under-determined set of contrasts, which is not a problem in the random effects because of the regularization, or "shrinkage", of the conditional modes.
The interaction of fulldummy
with complex random effects is subtle and complex with numerous potential edge cases. As we discover these edge cases, we will document and determine their behavior. Until such time, please check the model summary to verify that the expansion is working as you expected. If it is not, please report a use case on GitHub.
fit(MixedModel, @formula(reaction ~ 1 + days + (1 + fulldummy(days)|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
reaction ~ 1 + days + (1 + days | subj)
logLik -2 logLik AIC AICc BIC
-805.3992 1610.7983 1764.7983 1882.5630 2010.6560
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 817.19219 28.58657
days: 0 688.65447 26.24223 -0.37
days: 1 559.52523 23.65429 -0.25 +0.60
days: 2 509.47902 22.57164 -0.41 +0.24 +0.62
days: 3 325.83235 18.05083 +0.26 -0.27 +0.31 +0.64
days: 4 224.84439 14.99481 +0.76 -0.73 -0.27 -0.17 +0.57
days: 5 923.56039 30.39014 +0.45 -0.43 -0.23 -0.34 +0.07 +0.55
days: 6 2105.84857 45.88953 +0.32 -0.44 -0.40 -0.16 +0.24 +0.50 +0.49
days: 7 1073.83612 32.76944 +0.25 -0.03 -0.16 +0.05 -0.13 +0.06 +0.26 +0.43
days: 8 1709.93937 41.35141 +0.37 -0.35 -0.34 -0.33 -0.10 +0.36 +0.80 +0.50 +0.50
days: 9 2267.34021 47.61660 +0.42 -0.05 -0.05 -0.31 -0.14 +0.31 +0.67 +0.02 +0.30 +0.78
Residual 19.12018 4.37266
Number of obs: 180; levels of grouping factors: 18
Fixed-effects parameters:
───────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
───────────────────────────────────────────────────
(Intercept) 256.652 7.3572 34.88 <1e-99
days: 1 7.84395 5.45655 1.44 0.1506
days: 2 8.71009 7.28322 1.20 0.2317
days: 3 26.3402 8.52528 3.09 0.0020
days: 4 31.9976 9.2093 3.47 0.0005
days: 5 51.8666 11.401 4.55 <1e-05
days: 6 55.5264 14.7132 3.77 0.0002
days: 7 62.0988 10.1343 6.13 <1e-09
days: 8 79.9777 13.3291 6.00 <1e-08
days: 9 94.1994 13.1599 7.16 <1e-12
───────────────────────────────────────────────────
This fit produces a better fit as measured by the objective (negative twice the log-likelihood is 1610.8) but at the expense of adding many more parameters to the model. As a result, model comparison criteria such, as AIC
and BIC
, are inflated.
But using zerocorr
on the individual terms does remove the correlations between the levels:
fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + days|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
reaction ~ 1 + days + MixedModels.ZeroCorr((1 + days | subj))
logLik -2 logLik AIC AICc BIC
-882.9138 1765.8276 1807.8276 1813.6757 1874.8797
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 958.6412 30.9619
days: 1 0.0000 0.0000 .
days: 2 0.0000 0.0000 . .
days: 3 0.0000 0.0000 . . .
days: 4 0.0000 0.0000 . . . .
days: 5 519.5873 22.7945 . . . . .
days: 6 1703.7924 41.2770 . . . . . .
days: 7 608.7262 24.6724 . . . . . . .
days: 8 1272.8146 35.6765 . . . . . . . .
days: 9 1753.9941 41.8807 . . . . . . . . .
Residual 434.8773 20.8537
Number of obs: 180; levels of grouping factors: 18
Fixed-effects parameters:
───────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
───────────────────────────────────────────────────
(Intercept) 256.652 8.79873 29.17 <1e-99
days: 1 7.84395 6.95124 1.13 0.2591
days: 2 8.71009 6.95124 1.25 0.2102
days: 3 26.3402 6.95124 3.79 0.0002
days: 4 31.9976 6.95124 4.60 <1e-05
days: 5 51.8666 8.78554 5.90 <1e-08
days: 6 55.5264 11.9572 4.64 <1e-05
days: 7 62.0988 9.06299 6.85 <1e-11
days: 8 79.9777 10.9102 7.33 <1e-12
days: 9 94.1994 12.0733 7.80 <1e-14
───────────────────────────────────────────────────
fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj) + zerocorr(days|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
reaction ~ 1 + days + (1 | subj) + MixedModels.ZeroCorr((days | subj))
logLik -2 logLik AIC AICc BIC
-882.9138 1765.8276 1807.8276 1813.6757 1874.8797
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 958.6412 30.9619
days: 1 0.0000 0.0000 .
days: 2 0.0000 0.0000 . .
days: 3 0.0000 0.0000 . . .
days: 4 0.0000 0.0000 . . . .
days: 5 519.5873 22.7945 . . . . .
days: 6 1703.7924 41.2770 . . . . . .
days: 7 608.7262 24.6724 . . . . . . .
days: 8 1272.8146 35.6765 . . . . . . . .
days: 9 1753.9941 41.8807 . . . . . . . . .
Residual 434.8773 20.8537
Number of obs: 180; levels of grouping factors: 18
Fixed-effects parameters:
───────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
───────────────────────────────────────────────────
(Intercept) 256.652 8.79873 29.17 <1e-99
days: 1 7.84395 6.95124 1.13 0.2591
days: 2 8.71009 6.95124 1.25 0.2102
days: 3 26.3402 6.95124 3.79 0.0002
days: 4 31.9976 6.95124 4.60 <1e-05
days: 5 51.8666 8.78554 5.90 <1e-08
days: 6 55.5264 11.9572 4.64 <1e-05
days: 7 62.0988 9.06299 6.85 <1e-11
days: 8 79.9777 10.9102 7.33 <1e-12
days: 9 94.1994 12.0733 7.80 <1e-14
───────────────────────────────────────────────────
fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + fulldummy(days)|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
Linear mixed model fit by maximum likelihood
reaction ~ 1 + days + MixedModels.ZeroCorr((1 + days | subj))
logLik -2 logLik AIC AICc BIC
-878.9843 1757.9686 1801.9686 1808.4145 1872.2137
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 1135.519525 33.697471
days: 0 775.861870 27.854297 .
days: 1 357.549241 18.908973 . .
days: 2 221.191751 14.872517 . . .
days: 3 0.149095 0.386128 . . . .
days: 4 44.396726 6.663087 . . . . .
days: 5 670.254926 25.889282 . . . . . .
days: 6 1739.753028 41.710347 . . . . . . .
days: 7 908.600753 30.143005 . . . . . . . .
days: 8 1457.229180 38.173671 . . . . . . . . .
days: 9 2027.773402 45.030805 . . . . . . . . . .
Residual 180.953843 13.451909
Number of obs: 180; levels of grouping factors: 18
Fixed-effects parameters:
───────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
───────────────────────────────────────────────────
(Intercept) 256.652 10.7815 23.80 <1e-99
days: 1 7.84395 9.11445 0.86 0.3895
days: 2 8.71009 8.68895 1.00 0.3161
days: 3 26.3402 7.95096 3.31 0.0009
days: 4 31.9976 8.10407 3.95 <1e-04
days: 5 51.8667 10.0223 5.18 <1e-06
days: 6 55.5265 12.6437 4.39 <1e-04
days: 7 62.0988 10.6624 5.82 <1e-08
days: 8 79.9777 12.0069 6.66 <1e-10
days: 9 94.1994 13.2614 7.10 <1e-11
───────────────────────────────────────────────────
Fitting generalized linear mixed models
To create a GLMM representation, the distribution family for the response, and possibly the link function, must be specified.
verbagg = MixedModels.dataset(:verbagg)
verbaggform = @formula(r2 ~ 1 + anger + gender + btype + situ + mode + (1|subj) + (1|item));
gm1 = fit(MixedModel, verbaggform, verbagg, Bernoulli())
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
r2 ~ 1 + anger + gender + btype + situ + mode + (1 | subj) + (1 | item)
Distribution: Bernoulli{Float64}
Link: LogitLink()
logLik deviance AIC AICc BIC
-4067.9164 8135.8329 8153.8329 8153.8566 8216.2370
Variance components:
Column Variance Std.Dev.
subj (Intercept) 1.793515 1.339222
item (Intercept) 0.117144 0.342262
Number of obs: 7584; levels of grouping factors: 316, 24
Fixed-effects parameters:
──────────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
──────────────────────────────────────────────────────
(Intercept) -0.152169 0.385222 -0.40 0.6928
anger 0.0573701 0.0167529 3.42 0.0006
gender: M 0.320806 0.191208 1.68 0.0934
btype: scold -1.06011 0.184155 -5.76 <1e-08
btype: shout -2.10406 0.186514 -11.28 <1e-28
situ: self -1.05448 0.151191 -6.97 <1e-11
mode: want 0.706889 0.151004 4.68 <1e-05
──────────────────────────────────────────────────────
The canonical link, which is LogitLink
for the Bernoulli
distribution, is used if no explicit link is specified.
Note that, in keeping with convention in the GLM
package, the distribution family for a binary (i.e. 0/1) response is the Bernoulli
distribution. The Binomial
distribution is only used when the response is the fraction of trials returning a positive, in which case the number of trials must be specified as the case weights.
Optional arguments to fit
An alternative approach is to create the GeneralizedLinearMixedModel
object then call fit!
on it. The optional arguments fast
and/or nAGQ
can be passed to the optimization process via both fit
and fit!
(i.e these optimization settings are not used nor recognized when constructing the model).
As the name implies, fast=true
, provides a faster but somewhat less accurate fit. These fits may suffice for model comparisons.
gm1a = fit(MixedModel, verbaggform, verbagg, Bernoulli(), fast = true)
deviance(gm1a) - deviance(gm1)
0.3380011322660721
@benchmark fit(MixedModel, $verbaggform, $verbagg, Bernoulli())
BenchmarkTools.Trial: 2 samples with 1 evaluation.
Range (min … max): 2.794 s … 2.795 s ┊ GC (min … max): 0.00% … 0.00%
Time (median): 2.794 s ┊ GC (median): 0.00%
Time (mean ± σ): 2.794 s ± 1.044 ms ┊ GC (mean ± σ): 0.00% ± 0.00%
█ █
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
2.79 s Histogram: frequency by time 2.8 s <
Memory estimate: 21.51 MiB, allocs estimate: 356476.
@benchmark fit(MixedModel, $verbaggform, $verbagg, Bernoulli(), fast = true)
BenchmarkTools.Trial: 19 samples with 1 evaluation.
Range (min … max): 261.047 ms … 280.080 ms ┊ GC (min … max): 0.00% … 4.88%
Time (median): 264.845 ms ┊ GC (median): 0.00%
Time (mean ± σ): 265.222 ms ± 4.334 ms ┊ GC (mean ± σ): 0.27% ± 1.12%
█
▅█▅▅▁▁▁▅▁▅▁██▅▅▁▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅ ▁
261 ms Histogram: frequency by time 280 ms <
Memory estimate: 11.69 MiB, allocs estimate: 83498.
The optional argument nAGQ=k
causes evaluation of the deviance function to use a k
point adaptive Gauss-Hermite quadrature rule. This method only applies to models with a single, simple, scalar random-effects term, such as
contraception = MixedModels.dataset(:contra)
contraform = @formula(use ~ 1 + age + abs2(age) + livch + urban + (1|dist));
bernoulli = Bernoulli()
deviances = Dict{Symbol,Float64}()
b = @benchmarkable deviances[:default] = deviance(fit(MixedModel, $contraform, $contraception, $bernoulli));
run(b)
b = @benchmarkable deviances[:fast] = deviance(fit(MixedModel, $contraform, $contraception, $bernoulli, fast = true));
run(b)
b = @benchmarkable deviances[:nAGQ] = deviance(fit(MixedModel, $contraform, $contraception, $bernoulli, nAGQ=9));
run(b)
b = @benchmarkable deviances[:nAGQ_fast] = deviance(fit(MixedModel, $contraform, $contraception, $bernoulli, nAGQ=9, fast=true));
run(b)
sort(deviances)
OrderedCollections.OrderedDict{Symbol, Float64} with 4 entries:
:default => 2372.73
:fast => 2372.78
:nAGQ => 2372.46
:nAGQ_fast => 2372.51
Extractor functions
LinearMixedModel
and GeneralizedLinearMixedModel
are subtypes of StatsBase.RegressionModel
which, in turn, is a subtype of StatsBase.StatisticalModel
. Many of the generic extractors defined in the StatsBase
package have methods for these models.
Model-fit statistics
The statistics describing the quality of the model fit include
StatsAPI.loglikelihood
— Methodloglikelihood(d::Distribution{ArrayLikeVariate{N}}, x) where {N}
The log-likelihood of distribution d
with respect to all variate(s) contained in x
.
Here, x
can be any output of rand(d, dims...)
and rand!(d, x)
. For instance, x
can be
- an array of dimension
N
withsize(x) == size(d)
, - an array of dimension
N + 1
withsize(x)[1:N] == size(d)
, or - an array of arrays
xi
of dimensionN
withsize(xi) == size(d)
.
StatsAPI.aic
— Functionaic(model::StatisticalModel)
Akaike's Information Criterion, defined as $-2 \log L + 2k$, with $L$ the likelihood of the model, and k
its number of consumed degrees of freedom (as returned by dof
).
StatsAPI.bic
— FunctionMissing docstring for dof(::StatisticalModel)
. Check Documenter's build log for details.
Missing docstring for nobs(::StatisticalModel)
. Check Documenter's build log for details.
loglikelihood(fm1)
-163.66352994056504
aic(fm1)
333.3270598811301
bic(fm1)
337.53065202611657
dof(fm1) # 1 fixed effect, 2 variances
3
nobs(fm1) # 30 observations
30
loglikelihood(gm1)
-4067.916435286866
In general the deviance
of a statistical model fit is negative twice the log-likelihood adjusting for the saturated model.
StatsAPI.deviance
— Methoddeviance(m::GeneralizedLinearMixedModel{T}, nAGQ=1)::T where {T}
Return the deviance of m
evaluated by the Laplace approximation (nAGQ=1
) or nAGQ
-point adaptive Gauss-Hermite quadrature.
If the distribution D
does not have a scale parameter the Laplace approximation is the squared length of the conditional modes, $u$, plus the determinant of $Λ'Z'WZΛ + I$, plus the sum of the squared deviance residuals.
Because it is not clear what the saturated model corresponding to a particular LinearMixedModel
should be, negative twice the log-likelihood is called the objective
.
MixedModels.objective
— Functionobjective(m::LinearMixedModel)
Return negative twice the log-likelihood of model m
This value is also accessible as the deviance
but the user should bear in mind that this doesn't have all the properties of a deviance which is corrected for the saturated model. For example, it is not necessarily non-negative.
objective(fm1)
327.3270598811301
deviance(fm1)
327.3270598811301
The value optimized when fitting a GeneralizedLinearMixedModel
is the Laplace approximation to the deviance or an adaptive Gauss-Hermite evaluation.
MixedModels.deviance!
— Functiondeviance!(m::GeneralizedLinearMixedModel, nAGQ=1)
Update m.η
, m.μ
, etc., install the working response and working weights in m.LMM
, update m.LMM.A
and m.LMM.R
, then evaluate the deviance
.
MixedModels.deviance!(gm1)
8135.832870573721
Fixed-effects parameter estimates
The coef
and fixef
extractors both return the maximum likelihood estimates of the fixed-effects coefficients. They differ in their behavior in the rank-deficient case. The associated coefnames
and fixefnames
return the corresponding coefficient names.
StatsAPI.coef
— Functioncoef(model::StatisticalModel)
Return the coefficients of the model.
StatsAPI.coefnames
— Functioncoefnames(model::StatisticalModel)
Return the names of the coefficients.
MixedModels.fixef
— Functionfixef(m::MixedModel)
Return the fixed-effects parameter vector estimate of m
.
In the rank-deficient case the truncated parameter vector, of length rank(m)
is returned. This is unlike coef
which always returns a vector whose length matches the number of columns in X
.
MixedModels.fixefnames
— Functionfixefnames(m::MixedModel)
Return a (permuted and truncated in the rank-deficient case) vector of coefficient names.
coef(fm1)
coefnames(fm1)
1-element Vector{String}:
"(Intercept)"
fixef(fm1)
fixefnames(fm1)
1-element Vector{String}:
"(Intercept)"
An alternative extractor for the fixed-effects coefficient is the β
property. Properties whose names are Greek letters usually have an alternative spelling, which is the name of the Greek letter.
fm1.β
1-element Vector{Float64}:
1527.5000000000007
fm1.beta
1-element Vector{Float64}:
1527.5000000000007
gm1.β
7-element Vector{Float64}:
-0.15216867026207292
0.05737014104778681
0.3208058461671894
-1.0601090050817565
-2.104063072484944
-1.0544815542347317
0.7068886923883283
A full list of property names is returned by propertynames
propertynames(fm1)
(:formula, :reterms, :Xymat, :feterm, :sqrtwts, :parmap, :dims, :A, :L, :optsum, :θ, :theta, :β, :beta, :βs, :betas, :λ, :lambda, :stderror, :σ, :sigma, :σs, :sigmas, :σρs, :sigmarhos, :b, :u, :lowerbd, :X, :y, :corr, :vcov, :PCA, :rePCA, :objective, :pvalues)
propertynames(gm1)
(:A, :L, :theta, :beta, :coef, :λ, :σ, :sigma, :X, :y, :lowerbd, :objective, :σρs, :σs, :corr, :vcov, :PCA, :rePCA, :LMM, :β, :θ, :b, :u, :resp, :wt)
The variance-covariance matrix of the fixed-effects coefficients is returned by
StatsAPI.vcov
— Functionvcov(model::StatisticalModel)
Return the variance-covariance matrix for the coefficients of the model.
vcov(fm2)
2×2 Matrix{Float64}:
43.9868 -1.37039
-1.37039 2.25671
vcov(gm1)
7×7 Matrix{Float64}:
0.148396 -0.00561495 -0.00982354 … -0.0112068 -0.0113465
-0.00561495 0.000280659 7.19128e-5 -1.47959e-5 1.02403e-5
-0.00982354 7.19128e-5 0.0365607 -8.04452e-5 5.25876e-5
-0.0167982 -1.43707e-5 -9.25673e-5 0.000265817 -0.0001721
-0.0166222 -2.90541e-5 -0.000162396 0.000658983 -0.000520519
-0.0112068 -1.47959e-5 -8.04452e-5 … 0.0228588 -0.000247782
-0.0113465 1.02403e-5 5.25876e-5 -0.000247782 0.0228022
The standard errors are the square roots of the diagonal elements of the estimated variance-covariance matrix of the fixed-effects coefficient estimators.
StatsAPI.stderror
— Functionstderror(model::StatisticalModel)
Return the standard errors for the coefficients of the model.
stderror(fm2)
2-element Vector{Float64}:
6.632257772788787
1.5022354780057847
stderror(gm1)
7-element Vector{Float64}:
0.38522236254727515
0.016752880012356943
0.19120847766637172
0.1841548619831051
0.18651361638246777
0.15119121882520364
0.15100412708712097
Finally, the coeftable
generic produces a table of coefficient estimates, their standard errors, and their ratio. The p-values quoted here should be regarded as approximations.
StatsAPI.coeftable
— Functioncoeftable(model::StatisticalModel; level::Real=0.95)
Return a table with coefficients and related statistics of the model. level
determines the level for confidence intervals (by default, 95%).
The returned CoefTable
object implements the Tables.jl interface, and can be converted e.g. to a DataFrame
via using DataFrames; DataFrame(coeftable(model))
.
coeftable(fm2)
──────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
──────────────────────────────────────────────────
(Intercept) 251.405 6.63226 37.91 <1e-99
days 10.4673 1.50224 6.97 <1e-11
──────────────────────────────────────────────────
Covariance parameter estimates
The covariance parameters estimates, in the form shown in the model summary, are a VarCorr
object
VarCorr(fm2)
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 565.51068 23.78047
days 32.68212 5.71683 +0.08
Residual 654.94145 25.59182
VarCorr(gm1)
Variance components:
Column Variance Std.Dev.
subj (Intercept) 1.793515 1.339222
item (Intercept) 0.117144 0.342262
Individual components are returned by other extractors
MixedModels.varest
— Functionvarest(m::LinearMixedModel)
Returns the estimate of σ², the variance of the conditional distribution of Y given B.
varest(m::GeneralizedLinearMixedModel)
Returns the estimate of ϕ², the variance of the conditional distribution of Y given B.
For models with a dispersion parameter ϕ, this is simply ϕ². For models without a dispersion parameter, this value is missing
. This differs from disperion
, which returns 1
for models without a dispersion parameter.
For Gaussian models, this parameter is often called σ².
MixedModels.sdest
— Functionsdest(m::LinearMixedModel)
Return the estimate of σ, the standard deviation of the per-observation noise.
sdest(m::GeneralizedLinearMixedModel)
Return the estimate of the dispersion, i.e. the standard deviation of the per-observation noise.
For models with a dispersion parameter ϕ, this is simply ϕ. For models without a dispersion parameter, this value is missing
. This differs from disperion
, which returns 1
for models without a dispersion parameter.
For Gaussian models, this parameter is often called σ.
varest(fm2)
654.9414509863424
sdest(fm2)
25.591823908942917
fm2.σ
25.591823908942917
Conditional modes of the random effects
The ranef
extractor
MixedModels.ranef
— Functionranef(m::LinearMixedModel; uscale=false)
Return, as a Vector{Matrix{T}}
, the conditional modes of the random effects in model m
.
If uscale
is true
the random effects are on the spherical (i.e. u
) scale, otherwise on the original scale.
For a named variant, see raneftables
.
ranef(fm1)
1-element Vector{Matrix{Float64}}:
[-16.628222043024927 0.3695160454000294 … 53.57982658307787 -42.49434522106282]
fm1.b
1-element Vector{Matrix{Float64}}:
[-16.628222043024927 0.3695160454000294 … 53.57982658307787 -42.49434522106282]
returns the conditional modes of the random effects given the observed data. That is, these are the values that maximize the conditional density of the random effects given the observed data. For a LinearMixedModel
these are also the conditional means.
These are sometimes called the best linear unbiased predictors or BLUPs
but that name is not particularly meaningful.
At a superficial level these can be considered as the "estimates" of the random effects, with a bit of hand waving, but pursuing this analogy too far usually results in confusion.
To obtain tables associating the values of the conditional modes with the levels of the grouping factor, use
MixedModels.raneftables
— Functionraneftables(m::MixedModel; uscale = false)
Return the conditional means of the random effects as a NamedTuple
of columntables
as in
DataFrame(only(raneftables(fm1)))
6 rows × 2 columns
batch | (Intercept) | |
---|---|---|
String | Float64 | |
1 | A | -16.6282 |
2 | B | 0.369516 |
3 | C | 26.9747 |
4 | D | -21.8014 |
5 | E | 53.5798 |
6 | F | -42.4943 |
The corresponding conditional variances are returned by
MixedModels.condVar
— FunctioncondVar(m::LinearMixedModel)
Return the conditional variances matrices of the random effects.
The random effects are returned by ranef
as a vector of length k
, where k
is the number of random effects terms. The i
th element is a matrix of size vᵢ × ℓᵢ
where vᵢ
is the size of the vector-valued random effects for each of the ℓᵢ
levels of the grouping factor. Technically those values are the modes of the conditional distribution of the random effects given the observed data.
This function returns an array of k
three dimensional arrays, where the i
th array is of size vᵢ × vᵢ × ℓᵢ
. These are the diagonal blocks from the conditional variance-covariance matrix,
s² Λ(Λ'Z'ZΛ + I)⁻¹Λ'
condVar(fm1)
1-element Vector{Array{Float64, 3}}:
[362.3104773058705]
[362.3104773058705]
[362.3104773058705]
[362.3104773058705]
[362.3104773058705]
[362.3104773058705]
Case-wise diagnostics and residual degrees of freedom
The leverage
values
StatsAPI.leverage
— Functionleverage(model::RegressionModel)
Return the diagonal of the projection matrix of the model.
leverage(fm1)
30-element Vector{Float64}:
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
⋮
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
0.15650534846684694
are used in diagnostics for linear regression models to determine cases that exert a strong influence on their own predicted response.
The documentation refers to a "projection". For a linear model without random effects the fitted values are obtained by orthogonal projection of the response onto the column span of the model matrix and the sum of the leverage values is the dimension of this column span. That is, the sum of the leverage values is the rank of the model matrix and n - sum(leverage(m))
is the degrees of freedom for residuals. The sum of the leverage values is also the trace of the so-called "hat" matrix, H
. (The name "hat matrix" reflects the fact that $\hat{\mathbf{y}} = \mathbf{H} \mathbf{y}$. That is, H
puts a hat on y
.)
For a linear mixed model the sum of the leverage values will be between p
, the rank of the fixed-effects model matrix, and p + q
where q
is the total number of random effects. This number does not represent a dimension (or "degrees of freedom") of a linear subspace of all possible fitted values because the projection is not an orthogonal projection. Nevertheless, it is a reasonable measure of the effective degrees of freedom of the model and n - sum(leverage(m))
can be considered the effective residual degrees of freedom.
For model fm1
the dimensions are
n, p, q, k = size(fm1)
(30, 1, 6, 1)
which implies that the sum of the leverage values should be in the range [1, 7]. The actual value is
sum(leverage(fm1))
4.695160454005408
For model fm2
the dimensions are
n, p, q, k = size(fm2)
(180, 2, 36, 1)
providing a range of [2, 38] for the effective degrees of freedom for the model. The observed value is
sum(leverage(fm2))
28.611526085220852
When a model converges to a singular covariance, such as
fm3 = fit(MixedModel, @formula(yield ~ 1+(1|batch)), MixedModels.dataset(:dyestuff2))
Linear mixed model fit by maximum likelihood
yield ~ 1 + (1 | batch)
logLik -2 logLik AIC AICc BIC
-81.4365 162.8730 168.8730 169.7961 173.0766
Variance components:
Column Variance Std.Dev.
batch (Intercept) 0.00000 0.00000
Residual 13.34610 3.65323
Number of obs: 30; levels of grouping factors: 6
Fixed-effects parameters:
───────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
───────────────────────────────────────────────
(Intercept) 5.6656 0.666986 8.49 <1e-16
───────────────────────────────────────────────
the effective degrees of freedom is the lower bound.
sum(leverage(fm3))
0.9999999999999998
Models for which the estimates of the variances of the random effects are large relative to the residual variance have effective degrees of freedom close to the upper bound.
fm4 = fit(MixedModel, @formula(diameter ~ 1+(1|plate)+(1|sample)),
MixedModels.dataset(:penicillin))
Linear mixed model fit by maximum likelihood
diameter ~ 1 + (1 | plate) + (1 | sample)
logLik -2 logLik AIC AICc BIC
-166.0942 332.1883 340.1883 340.4761 352.0676
Variance components:
Column Variance Std.Dev.
plate (Intercept) 0.714979 0.845565
sample (Intercept) 3.135193 1.770648
Residual 0.302426 0.549933
Number of obs: 144; levels of grouping factors: 24, 6
Fixed-effects parameters:
─────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
─────────────────────────────────────────────────
(Intercept) 22.9722 0.744596 30.85 <1e-99
─────────────────────────────────────────────────
sum(leverage(fm4))
27.46531790663849
Also, a model fit by the REML criterion generally has larger estimates of the variance components and hence a larger effective degrees of freedom.
fm4r = fit(MixedModel, @formula(diameter ~ 1+(1|plate)+(1|sample)),
MixedModels.dataset(:penicillin), REML=true)
Linear mixed model fit by REML
diameter ~ 1 + (1 | plate) + (1 | sample)
REML criterion at convergence: 330.86058899108457
Variance components:
Column Variance Std.Dev.
plate (Intercept) 0.716908 0.846704
sample (Intercept) 3.730901 1.931554
Residual 0.302416 0.549923
Number of obs: 144; levels of grouping factors: 24, 6
Fixed-effects parameters:
─────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
─────────────────────────────────────────────────
(Intercept) 22.9722 0.808572 28.41 <1e-99
─────────────────────────────────────────────────
sum(leverage(fm4r))
27.47236135264903