Parametric bootstrap for mixed-effects models

Julia is well-suited to implementing bootstrapping and other simulation-based methods for statistical models. The parametricbootstrap function in the MixedModels package provides an efficient parametric bootstrap for mixed-effects models.

MixedModels.parametricbootstrap — Function

parametricbootstrap([rng::AbstractRNG], nsamp::Integer, m::MixedModel{T}, ftype=T;
    β = coef(m), σ = m.σ, θ = m.θ, use_threads=false, hide_progress=false)

Perform nsamp parametric bootstrap replication fits of m, returning a MixedModelBootstrap.

The default random number generator is Random.GLOBAL_RNG.

ftype can be used to store the computed bootstrap values in a lower precision. ftype is not a named argument because named arguments are not used in method dispatch and thus specialization. In other words, having ftype as a positional argument has some potential performance benefits.

Named Arguments

β, σ, and θ are the values of m's parameters for simulating the responses.
σ is only valid for LinearMixedModel and GeneralizedLinearMixedModel for

families with a dispersion parameter.

use_threads determines whether or not to use thread-based parallelism.
hide_progress can be used to disable the progress bar. Note that the progress

bar is automatically disabled for non-interactive (i.e. logging) contexts.

Note

Note that use_threads=true may not offer a performance boost and may even decrease peformance if multithreaded linear algebra (BLAS) routines are available. In this case, threads at the level of the linear algebra may already occupy all processors/processor cores. There are plans to provide better support in coordinating Julia- and BLAS-level threads in the future.

Warning

The PRNG shared between threads is locked using Threads.SpinLock, which should not be used recursively. Do not wrap parametricbootstrap in an outer SpinLock.

source

The parametric bootstrap

Bootstrapping is a family of procedures for generating sample values of a statistic, allowing for visualization of the distribution of the statistic or for inference from this sample of values.

A parametric bootstrap is used with a parametric model, m, that has been fit to data. The procedure is to simulate n response vectors from m using the estimated parameter values and refit m to these responses in turn, accumulating the statistics of interest at each iteration.

The parameters of a LinearMixedModel object are the fixed-effects parameters, β, the standard deviation, σ, of the per-observation noise, and the covariance parameter, θ, that defines the variance-covariance matrices of the random effects.

For example, a simple linear mixed-effects model for the Dyestuff data in the lme4 package for R is fit by

using DataFrames
using Gadfly          # plotting package
using MixedModels
using Random

dyestuff = MixedModels.dataset(:dyestuff)
m1 = fit(MixedModel, @formula(yield ~ 1 + (1 | batch)), dyestuff)

Linear mixed model fit by maximum likelihood
 yield ~ 1 + (1 | batch)
   logLik   -2 logLik     AIC       AICc        BIC    
  -163.6635   327.3271   333.3271   334.2501   337.5307

Variance components:
            Column    Variance Std.Dev.
batch    (Intercept)  1388.3334 37.2603
Residual              2451.2500 49.5101
 Number of obs: 30; levels of grouping factors: 6

  Fixed-effects parameters:
────────────────────────────────────────────────
              Coef.  Std. Error      z  Pr(>|z|)
────────────────────────────────────────────────
(Intercept)  1527.5     17.6946  86.33    <1e-99
────────────────────────────────────────────────

To bootstrap the model parameters, first initialize a random number generator then create a bootstrap sample

const rng = MersenneTwister(1234321);
samp = parametricbootstrap(rng, 10_000, m1);
df = DataFrame(samp.allpars);
first(df, 10)

10 rows × 5 columns

	iter	type	group	names	value
	Int64	String	String?	String?	Float64
1	1	β	missing	(Intercept)	1509.13
2	1	σ	batch	(Intercept)	14.312
3	1	σ	residual	missing	67.4315
4	2	β	missing	(Intercept)	1538.08
5	2	σ	batch	(Intercept)	25.5673
6	2	σ	residual	missing	47.9831
7	3	β	missing	(Intercept)	1508.02
8	3	σ	batch	(Intercept)	21.7622
9	3	σ	residual	missing	50.1346
10	4	β	missing	(Intercept)	1538.47

Especially for those with a background in R or pandas, the simplest way of accessing the parameter estimates in the parametric bootstrap object is to create a DataFrame from the allpars property as shown above.

We can use filter to filter out relevant rows of a dataframe. A density plot of the estimates of σ, the residual standard deviation, can be created as

σres = filter(df) do row # create a thunk that operates on rows
    row.type == "σ" && row.group == "residual" # our filtering rule
end

plot(x = σres.value, Geom.density, Guide.xlabel("Parametric bootstrap estimates of σ"))

For the estimates of the intercept parameter, the getproperty extractor must be used

plot(filter(:type => ==("β"),  df), x = :value, Geom.density, Guide.xlabel("Parametric bootstrap estimates of β₁"))

A density plot of the estimates of the standard deviation of the random effects is obtained as

σbatch = filter(df) do row # create a thunk that operates on rows
    row.type == "σ" && row.group == "batch" # our filtering rule
end
plot(x = σbatch.value, Geom.density,
    Guide.xlabel("Parametric bootstrap estimates of σ₁"))

Notice that this density plot has a spike, or mode, at zero. Although this mode appears to be diffuse, this is an artifact of the way that density plots are created. In fact, it is a pulse, as can be seen from a histogram.

plot(x = σbatch.value, Geom.histogram,
    Guide.xlabel("Parametric bootstrap estimates of σ₁"))

The bootstrap sample can be used to generate intervals that cover a certain percentage of the bootstrapped values. We refer to these as "coverage intervals", similar to a confidence interval. The shortest such intervals, obtained with the shortestcovint extractor, correspond to a highest posterior density interval in Bayesian inference.

MixedModels.shortestcovint — Function

shortestcovint(v, level = 0.95)

Return the shortest interval containing level proportion of the values of v

source

shortestcovint(bsamp::MixedModelFitCollection, level = 0.95)

Return the shortest interval containing level proportion for each parameter from bsamp.allpars.

Warning

Currently, correlations that are systematically zero are included in the the result. This may change in a future release without being considered a breaking change.

source

We generate these for all random and fixed effects:

combine(groupby(df, [:type, :group, :names]), :value => shortestcovint => :interval)

3 rows × 4 columns

	type	group	names	interval
	String	String?	String?	Tuple…
1	β	missing	(Intercept)	(1492.54, 1561.34)
2	σ	batch	(Intercept)	(0.0, 54.7042)
3	σ	residual	missing	(35.4909, 63.0209)

We can also generate this directly from the original bootstrap object:

DataFrame(shortestcovint(samp))

3 rows × 5 columns

	type	group	names	lower	upper
	String	String?	String?	Float64	Float64
1	β	missing	(Intercept)	1492.54	1561.34
2	σ	batch	(Intercept)	0.0	54.7042
3	σ	residual	missing	35.4909	63.0209

A value of zero for the standard deviation of the random effects is an example of a singular covariance. It is easy to detect the singularity in the case of a scalar random-effects term. However, it is not as straightforward to detect singularity in vector-valued random-effects terms.

For example, if we bootstrap a model fit to the sleepstudy data

sleepstudy = MixedModels.dataset(:sleepstudy)
m2 = fit(
    MixedModel,
    @formula(reaction ~ 1+days+(1+days|subj)),
    sleepstudy,
)

Linear mixed model fit by maximum likelihood
 reaction ~ 1 + days + (1 + days | subj)
   logLik   -2 logLik     AIC       AICc        BIC    
  -875.9697  1751.9393  1763.9393  1764.4249  1783.0971

Variance components:
            Column    Variance Std.Dev.   Corr.
subj     (Intercept)  565.51068 23.78047
         days          32.68212  5.71683 +0.08
Residual              654.94145 25.59182
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
──────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)
──────────────────────────────────────────────────
(Intercept)  251.405      6.63226  37.91    <1e-99
days          10.4673     1.50224   6.97    <1e-11
──────────────────────────────────────────────────

samp2 = parametricbootstrap(rng, 10_000, m2, use_threads=true);
df2 = DataFrame(samp2.allpars);
first(df2, 10)

10 rows × 5 columns

	iter	type	group	names	value
	Int64	String	String?	String?	Float64
1	1	β	missing	(Intercept)	262.397
2	1	β	missing	days	10.5117
3	1	σ	subj	(Intercept)	24.8119
4	1	σ	subj	days	6.86013
5	1	ρ	subj	(Intercept), days	-0.335778
6	1	σ	residual	missing	26.2691
7	2	β	missing	(Intercept)	252.753
8	2	β	missing	days	10.9396
9	2	σ	subj	(Intercept)	15.4139
10	2	σ	subj	days	5.27726

the singularity can be exhibited as a standard deviation of zero or as a correlation of $\pm1$.

DataFrame(shortestcovint(samp2))

6 rows × 5 columns

	type	group	names	lower	upper
	String	String?	String?	Float64	Float64
1	β	missing	(Intercept)	238.583	263.892
2	β	missing	days	7.53964	13.3682
3	σ	subj	(Intercept)	10.508	33.2184
4	σ	subj	days	3.02493	7.67527
5	ρ	subj	(Intercept), days	-0.405358	1.0
6	σ	residual	missing	22.6568	28.4312

A histogram of the estimated correlations from the bootstrap sample has a spike at +1.

ρs = filter(df2) do row
    row.type == "ρ" && row.group == "subj"
end
plot(x = ρs.value, Geom.histogram,
    Guide.xlabel("Parametric bootstrap samples of correlation of random effects"))

or, as a count,

count(ρs.value .≈ 1)

Close examination of the histogram shows a few values of -1.

count(ρs.value .≈ -1)

Furthermore there are even a few cases where the estimate of the standard deviation of the random effect for the intercept is zero.

σs = filter(df2) do row
    row.type == "σ" && row.group == "subj" && row.names == "(Intercept)"
end
count(σs.value .≈ 0)

There is a general condition to check for singularity of an estimated covariance matrix or matrices in a bootstrap sample. The parameter optimized in the estimation is θ, the relative covariance parameter. Some of the elements of this parameter vector must be non-negative and, when one of these components is approximately zero, one of the covariance matrices will be singular.

The issingular method for a MixedModel object that tests if a parameter vector θ corresponds to a boundary or singular fit.

This operation is encapsulated in a method for the issingular function.

count(issingular(samp2))