Parametric bootstrap for mixed-effects models

Julia is well-suited to implementing bootstrapping and other simulation-based methods for statistical models. The parametricbootstrap function in the MixedModels package provides an efficient parametric bootstrap for mixed-effects models.

MixedModels.parametricbootstrap (Function)
parametricbootstrap([rng::AbstractRNG], nsamp::Integer, m::MixedModel{T}, ftype=T;
    β = coef(m), σ = m.σ, θ = m.θ, use_threads=false, hide_progress=false)

Perform nsamp parametric bootstrap replication fits of m, returning a MixedModelBootstrap.

The default random number generator is Random.GLOBAL_RNG.

ftype can be used to store the computed bootstrap values in a lower precision. ftype is not a named argument because named arguments are not used in method dispatch and thus specialization. In other words, having ftype as a positional argument has some potential performance benefits.

Named Arguments

  • β, σ, and θ are the values of m's parameters for simulating the responses.
  • σ is only valid for LinearMixedModel and GeneralizedLinearMixedModel for families with a dispersion parameter.
  • use_threads determines whether or not to use thread-based parallelism.
  • hide_progress can be used to disable the progress bar. Note that the progress bar is automatically disabled for non-interactive (i.e. logging) contexts.

Note

Setting use_threads=true may not offer a performance boost and may even decrease performance if multithreaded linear algebra (BLAS) routines are available. In this case, threads at the level of the linear algebra may already occupy all processors/processor cores. There are plans to provide better support in coordinating Julia- and BLAS-level threads in the future.

Warning

The PRNG shared between threads is locked using Threads.SpinLock, which should not be used recursively. Do not wrap parametricbootstrap in an outer SpinLock.

The parametric bootstrap

Bootstrapping is a family of procedures for generating sample values of a statistic, allowing for visualization of the distribution of the statistic or for inference from this sample of values.

A parametric bootstrap is used with a parametric model, m, that has been fit to data. The procedure is to simulate n response vectors from m using the estimated parameter values and refit m to these responses in turn, accumulating the statistics of interest at each iteration.
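
As a rough illustration of this procedure, the sketch below applies it to a much simpler model than a mixed model: independent normal observations, where refitting reduces to recomputing the sample mean and the maximum likelihood standard deviation. The helper normal_bootstrap, the data y, and the seed are made up for illustration and are not part of MixedModels.

```julia
using Random, Statistics

# Hypothetical helper: parametric bootstrap for an i.i.d. normal model.
function normal_bootstrap(rng::AbstractRNG, y::AbstractVector{<:Real}, n::Integer)
    μ_hat = mean(y)                       # "fit" the model: ML estimate of the mean
    σ_hat = std(y; corrected=false)       # ML estimate of the standard deviation
    ysim = Vector{Float64}(undef, length(y))
    μs = Vector{Float64}(undef, n)
    σs = Vector{Float64}(undef, n)
    for i in 1:n
        randn!(rng, ysim)                 # standard normal draws
        ysim .= μ_hat .+ σ_hat .* ysim    # simulate a response vector from the fitted model
        μs[i] = mean(ysim)                # refit and accumulate the statistics of interest
        σs[i] = std(ysim; corrected=false)
    end
    return (μ = μs, σ = σs)
end

rng_demo = MersenneTwister(42)            # arbitrary seed
y = 1500 .+ 50 .* randn(rng_demo, 30)     # made-up data
boot = normal_bootstrap(rng_demo, y, 1_000)
```

The vectors boot.μ and boot.σ then play the same role as the β and σ rows of a bootstrap sample: each holds one estimate per simulated response vector.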

The parameters of a LinearMixedModel object are the fixed-effects parameters, β, the standard deviation, σ, of the per-observation noise, and the covariance parameter, θ, that defines the variance-covariance matrices of the random effects.

For example, a simple linear mixed-effects model for the Dyestuff data in the lme4 package for R is fit by

using DataFrames
using Gadfly          # plotting package
using MixedModels
using Random
dyestuff = MixedModels.dataset(:dyestuff)
m1 = fit(MixedModel, @formula(yield ~ 1 + (1 | batch)), dyestuff)
Linear mixed model fit by maximum likelihood
 yield ~ 1 + (1 | batch)
   logLik   -2 logLik     AIC       AICc        BIC    
  -163.6635   327.3271   333.3271   334.2501   337.5307

Variance components:
            Column    Variance Std.Dev.
batch    (Intercept)  1388.3334 37.2603
Residual              2451.2500 49.5101
 Number of obs: 30; levels of grouping factors: 6

  Fixed-effects parameters:
────────────────────────────────────────────────
              Coef.  Std. Error      z  Pr(>|z|)
────────────────────────────────────────────────
(Intercept)  1527.5     17.6946  86.33    <1e-99
────────────────────────────────────────────────

To bootstrap the model parameters, first initialize a random number generator, then create a bootstrap sample:

const rng = MersenneTwister(1234321);
samp = parametricbootstrap(rng, 10_000, m1);
df = DataFrame(samp.allpars);
first(df, 10)

10 rows × 5 columns

 Row │ iter   type    group     names        value
     │ Int64  String  String?   String?      Float64
─────┼────────────────────────────────────────────────
   1 │     1  β       missing   (Intercept)  1509.13
   2 │     1  σ       batch     (Intercept)    14.312
   3 │     1  σ       residual  missing        67.4315
   4 │     2  β       missing   (Intercept)  1538.08
   5 │     2  σ       batch     (Intercept)    25.5673
   6 │     2  σ       residual  missing        47.9831
   7 │     3  β       missing   (Intercept)  1508.02
   8 │     3  σ       batch     (Intercept)    21.7622
   9 │     3  σ       residual  missing        50.1346
  10 │     4  β       missing   (Intercept)  1538.47

Especially for those with a background in R or pandas, the simplest way of accessing the parameter estimates in the parametric bootstrap object is to create a DataFrame from the allpars property as shown above.

We can use filter to select the relevant rows of a data frame. A density plot of the estimates of σ, the residual standard deviation, can be created as

σres = filter(df) do row # anonymous function applied to each row
    row.type == "σ" && row.group == "residual" # our filtering rule
end

plot(x = σres.value, Geom.density, Guide.xlabel("Parametric bootstrap estimates of σ"))
[Figure: density plot; x-axis: Parametric bootstrap estimates of σ]

For the estimates of the intercept parameter, the rows of type β are selected, here with the column => predicate form of filter

plot(filter(:type => ==("β"),  df), x = :value, Geom.density, Guide.xlabel("Parametric bootstrap estimates of β₁"))
[Figure: density plot; x-axis: Parametric bootstrap estimates of β₁]

A density plot of the estimates of the standard deviation of the random effects is obtained as

σbatch = filter(df) do row # anonymous function applied to each row
    row.type == "σ" && row.group == "batch" # our filtering rule
end
plot(x = σbatch.value, Geom.density,
    Guide.xlabel("Parametric bootstrap estimates of σ₁"))
[Figure: density plot; x-axis: Parametric bootstrap estimates of σ₁]

Notice that this density plot has a spike, or mode, at zero. Although this mode appears to be diffuse, this is an artifact of the way that density plots are created. In fact, it is a pulse, as can be seen from a histogram.

plot(x = σbatch.value, Geom.histogram,
    Guide.xlabel("Parametric bootstrap estimates of σ₁"))
[Figure: histogram; x-axis: Parametric bootstrap estimates of σ₁]
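
Why a density plot spreads a pulse at zero can be seen from a hand-rolled kernel density estimate. The sketch below is only an illustration under assumptions: kde_at is a hypothetical helper using a Gaussian kernel with a fixed bandwidth (not necessarily what Gadfly does), and the data are made up so that a tenth of the values are exactly zero.

```julia
# Hypothetical helper: Gaussian kernel density estimate at x with bandwidth h
kde_at(x, data, h) =
    sum(d -> exp(-0.5 * ((x - d) / h)^2), data) / (length(data) * h * sqrt(2π))

data = vcat(zeros(100), fill(40.0, 900))   # made-up sample: 100 values exactly zero

count(iszero, data)            # number of values in the pulse at zero
kde_at(-5.0, data, 5.0) > 0    # the smoothed estimate is positive below zero
```

Every kernel centered at zero contributes density on both sides of zero, so the estimate is positive at negative values even though no observation is negative. A histogram counts the values in each bin without smoothing and so shows the pulse directly.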

The bootstrap sample can be used to generate intervals that cover a certain percentage of the bootstrapped values. We refer to these as "coverage intervals", similar to a confidence interval. The shortest such intervals, obtained with the shortestcovint extractor, correspond to a highest posterior density interval in Bayesian inference.
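
One way to find the shortest interval containing a given proportion of the values is to sort them and slide a window of fixed size over the sorted values. The sketch below follows that idea; shortest_interval is a hypothetical helper written for illustration and should not be read as the implementation of shortestcovint.

```julia
# Hypothetical helper, not the MixedModels implementation.
function shortest_interval(v::AbstractVector{<:Real}, level::Real = 0.95)
    0 < level < 1 || throw(ArgumentError("level must be in (0, 1)"))
    isempty(v) && throw(ArgumentError("v must not be empty"))
    s = sort(v)
    n = length(s)
    k = ceil(Int, level * n)          # number of values the interval must contain
    best = 1
    for i in 2:(n - k + 1)            # slide a window of k consecutive sorted values
        if s[i + k - 1] - s[i] < s[best + k - 1] - s[best]
            best = i                  # keep the narrowest window seen so far
        end
    end
    return (s[best], s[best + k - 1])
end

shortest_interval([0.0, 0.0, 0.0, 1.0, 2.0, 10.0], 0.5)   # (0.0, 0.0)
shortest_interval([10.0, 1.0, 1.5, 7.0], 0.5)             # (1.0, 1.5)
```

The first call shows that a pulse of identical values can produce an interval of zero width, and more generally that such a pulse pulls an endpoint of the interval onto the boundary value.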

MixedModels.shortestcovint (Function)
shortestcovint(v, level = 0.95)

Return the shortest interval containing level proportion of the values of v.

shortestcovint(bsamp::MixedModelFitCollection, level = 0.95)

Return the shortest interval containing level proportion for each parameter from bsamp.allpars.

Warning

Currently, correlations that are systematically zero are included in the result. This may change in a future release without being considered a breaking change.

We generate these for all random and fixed effects:

combine(groupby(df, [:type, :group, :names]), :value => shortestcovint => :interval)

3 rows × 4 columns

 Row │ type    group     names        interval
     │ String  String?   String?      Tuple…
─────┼──────────────────────────────────────────────────
   1 │ β       missing   (Intercept)  (1492.54, 1561.34)
   2 │ σ       batch     (Intercept)  (0.0, 54.7042)
   3 │ σ       residual  missing      (35.4909, 63.0209)

We can also generate this directly from the original bootstrap object:

DataFrame(shortestcovint(samp))

3 rows × 5 columns

 Row │ type    group     names        lower      upper
     │ String  String?   String?      Float64    Float64
─────┼─────────────────────────────────────────────────────
   1 │ β       missing   (Intercept)  1492.54    1561.34
   2 │ σ       batch     (Intercept)     0.0       54.7042
   3 │ σ       residual  missing        35.4909    63.0209

A value of zero for the standard deviation of the random effects is an example of a singular covariance. It is easy to detect the singularity in the case of a scalar random-effects term. However, it is not as straightforward to detect singularity in vector-valued random-effects terms.

For example, if we bootstrap a model fit to the sleepstudy data

sleepstudy = MixedModels.dataset(:sleepstudy)
m2 = fit(
    MixedModel,
    @formula(reaction ~ 1+days+(1+days|subj)),
    sleepstudy,
)
Linear mixed model fit by maximum likelihood
 reaction ~ 1 + days + (1 + days | subj)
   logLik   -2 logLik     AIC       AICc        BIC    
  -875.9697  1751.9393  1763.9393  1764.4249  1783.0971

Variance components:
            Column    Variance Std.Dev.   Corr.
subj     (Intercept)  565.51068 23.78047
         days          32.68212  5.71683 +0.08
Residual              654.94145 25.59182
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
──────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)
──────────────────────────────────────────────────
(Intercept)  251.405      6.63226  37.91    <1e-99
days          10.4673     1.50224   6.97    <1e-11
──────────────────────────────────────────────────
samp2 = parametricbootstrap(rng, 10_000, m2, use_threads=true);
df2 = DataFrame(samp2.allpars);
first(df2, 10)

10 rows × 5 columns

 Row │ iter   type    group     names              value
     │ Int64  String  String?   String?            Float64
─────┼───────────────────────────────────────────────────────
   1 │     1  β       missing   (Intercept)        262.397
   2 │     1  β       missing   days                10.5117
   3 │     1  σ       subj      (Intercept)         24.8119
   4 │     1  σ       subj      days                 6.86013
   5 │     1  ρ       subj      (Intercept), days   -0.335778
   6 │     1  σ       residual  missing             26.2691
   7 │     2  β       missing   (Intercept)        252.753
   8 │     2  β       missing   days                10.9396
   9 │     2  σ       subj      (Intercept)         15.4139
  10 │     2  σ       subj      days                 5.27726

the singularity can be exhibited as a standard deviation of zero or as a correlation of ±1.

DataFrame(shortestcovint(samp2))

6 rows × 5 columns

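 Row │ type    group     names              lower       upper
     │ String  String?   String?            Float64     Float64
─────┼────────────────────────────────────────────────────────────
   1 │ β       missing   (Intercept)        238.583     263.892
   2 │ β       missing   days                 7.53964    13.3682
   3 │ σ       subj      (Intercept)         10.508      33.2184
   4 │ σ       subj      days                 3.02493     7.67527
   5 │ ρ       subj      (Intercept), days   -0.405358    1.0
   6 │ σ       residual  missing             22.6568     28.4312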

A histogram of the estimated correlations from the bootstrap sample has a spike at +1.

ρs = filter(df2) do row
    row.type == "ρ" && row.group == "subj"
end
plot(x = ρs.value, Geom.histogram,
    Guide.xlabel("Parametric bootstrap samples of correlation of random effects"))
[Figure: histogram; x-axis: Parametric bootstrap samples of correlation of random effects]

or, as a count,

count(ρs.value .≈ 1)
308

Close examination of the histogram shows a few values of -1.

count(ρs.value .≈ -1)
2

Furthermore, there are even a few cases where the estimate of the standard deviation of the random effect for the intercept is zero.

σs = filter(df2) do row
    row.type == "σ" && row.group == "subj" && row.names == "(Intercept)"
end
count(σs.value .≈ 0)
5
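
A detail of the comparison used in this count is worth keeping in mind: with its default tolerances, isapprox (the function behind ≈) uses only a relative tolerance, so a comparison against zero is true only for values that are exactly zero. The sketch below uses made-up values to show the difference an absolute tolerance makes; the tolerance 1e-8 is an arbitrary choice for illustration.

```julia
vals = [0.0, 1e-12, 0.5]                 # made-up values

count(vals .≈ 0)                         # 1: only the exact zero matches
count(isapprox.(vals, 0; atol=1e-8))     # 2: the tiny nonzero value now matches too
```

The count of 5 above therefore refers to estimates that are exactly zero, not merely close to zero.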

There is a general condition to check for singularity of an estimated covariance matrix or matrices in a bootstrap sample. The parameter optimized in the estimation is θ, the relative covariance parameter. Some of the elements of this parameter vector must be non-negative and, when one of these components is approximately zero, one of the covariance matrices will be singular.
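
For a single vector-valued term with two random effects per group, as in m2, the condition can be sketched by hand. The code below assumes that the three elements of θ fill the lower triangle of a 2×2 factor λ column by column and that the relative covariance matrix is λλ'; relcov, corr_from_θ, and on_boundary are hypothetical helpers for illustration, and the θ values are made up.

```julia
using LinearAlgebra

# Hypothetical helpers; the layout of θ in λ is an assumption of this sketch.
function relcov(θ)
    λ = [θ[1] 0.0; θ[2] θ[3]]     # lower-triangular relative covariance factor
    return λ * λ'                 # relative covariance matrix of the random effects
end

function corr_from_θ(θ)
    Σ = relcov(θ)
    return Σ[1, 2] / sqrt(Σ[1, 1] * Σ[2, 2])
end

# The diagonal elements of λ are the ones constrained to be non-negative.
on_boundary(θ) = iszero(θ[1]) || iszero(θ[3])

θ_demo = [0.9, 0.2, 0.0]          # made-up value with a zero diagonal element
on_boundary(θ_demo)               # true
corr_from_θ(θ_demo)               # approximately 1
det(relcov(θ_demo))               # approximately 0, i.e. a singular matrix
```

Under these assumptions, a zero in the last element of θ yields a correlation of ±1 (the sign follows the off-diagonal element), while a zero in the first element yields a zero standard deviation for the intercept. Both cases are detected by checking the constrained elements of θ against zero.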

The issingular function tests whether a parameter vector θ corresponds to a boundary or singular fit. Applied to a bootstrap sample, it performs this test for each replicate.

count(issingular(samp2))
315