# API

In addition to its own functionality, MixedModels.jl also implements extensive support for the `StatsAPI.StatisticalModel`

and `StatsAPI.RegressionModel`

API.

## Types

`MixedModels.BlockDescription`

— Type`BlockDescription`

Description of blocks of `A`

and `L`

in a `LinearMixedModel`

**Fields**

`blknms`

: Vector{String} of block names`blkrows`

: Vector{Int} of the number of rows in each block`ALtypes`

: Matrix{String} of datatypes for blocks in`A`

and`L`

.

When a block in `L`

is the same type as the corresponding block in `A`

, it is described with a single name, such as `Dense`

. When the types differ the entry in `ALtypes`

is of the form `Diag/Dense`

, as determined by a `shorttype`

method.

`MixedModels.BlockedSparse`

— Type`BlockedSparse{Tv,S,P}`

A `SparseMatrixCSC`

whose nonzeros form blocks of rows or columns or both.

**Members**

`cscmat`

:`SparseMatrixCSC{Tv, Int32}`

representation for general calculations`nzasmat`

: nonzeros of`cscmat`

as a dense matrix`colblkptr`

: pattern of blocks of columns

The only time these are created are as products of `ReMat`

s.

`MixedModels.FeMat`

— Type`FeMat{T,S}`

A matrix and a (possibly) weighted copy of itself.

Typically, an `FeMat`

represents the fixed-effects model matrix with the response (`y`

) concatenated as a final column.

`FeMat`

is not the same as `FeTerm`

.

**Fields**

`xy`

: original matrix, called`xy`

b/c in practice this is`hcat(fullrank(X), y)`

`wtxy`

: (possibly) weighted copy of`xy`

(shares storage with`xy`

until weights are applied)

Upon construction the `xy`

and `wtxy`

fields refer to the same matrix

`MixedModels.FeTerm`

— Type`FeTerm{T,S}`

Term with an explicit, constant matrix representation

Typically, an `FeTerm`

represents the model matrix for the fixed effects.

`FeTerm`

is not the same as `FeMat`

!

**Fields**

`x`

: full model matrix`piv`

: pivot`Vector{Int}`

for moving linearly dependent columns to the right`rank`

: computational rank of`x`

`cnames`

: vector of column names

`MixedModels.FeTerm`

— Method`FeTerm(X::SparseMatrixCSC, cnms)`

Convenience constructor for a sparse `FeTerm`

assuming full rank, identity pivot and unit weights.

Note: automatic rank deficiency handling may be added to this method in the future, as discused in the vignette "Rank deficiency in mixed-effects models" for general `FeTerm`

.

`MixedModels.FeTerm`

— Method`FeTerm(X::AbstractMatrix, cnms)`

Convenience constructor for `FeTerm`

that computes the rank and pivot with unit weights.

See the vignette "Rank deficiency in mixed-effects models" for more information on the computation of the rank and pivot.

`MixedModels.GaussHermiteNormalized`

— Type`GaussHermiteNormalized{K}`

A struct with 2 SVector{K,Float64} members

`z`

: abscissae for the K-point Gauss-Hermite quadrature rule on the Z scale`wt`

: Gauss-Hermite weights normalized to sum to unity

`MixedModels.GeneralizedLinearMixedModel`

— Type`GeneralizedLinearMixedModel`

Generalized linear mixed-effects model representation

**Fields**

`LMM`

: a`LinearMixedModel`

- the local approximation to the GLMM.`β`

: the pivoted and possibly truncated fixed-effects vector`β₀`

: similar to`β`

. Used in the PIRLS algorithm if step-halving is needed.`θ`

: covariance parameter vector`b`

: similar to`u`

, equivalent to`broadcast!(*, b, LMM.Λ, u)`

`u`

: a vector of matrices of random effects`u₀`

: similar to`u`

. Used in the PIRLS algorithm if step-halving is needed.`resp`

: a`GlmResp`

object`η`

: the linear predictor`wt`

: vector of prior case weights, a value of`T[]`

indicates equal weights.

The following fields are used in adaptive Gauss-Hermite quadrature, which applies only to models with a single random-effects term, in which case their lengths are the number of levels in the grouping factor for that term. Otherwise they are zero-length vectors.

`devc`

: vector of deviance components`devc0`

: vector of deviance components at offset of zero`sd`

: approximate standard deviation of the conditional density`mult`

: multiplier

**Properties**

In addition to the fieldnames, the following names are also accessible through the `.`

extractor

`theta`

: synonym for`θ`

`beta`

: synonym for`β`

`σ`

or`sigma`

: common scale parameter (value is`NaN`

for distributions without a scale parameter)`lowerbd`

: vector of lower bounds on the combined elements of`β`

and`θ`

`formula`

,`trms`

,`A`

,`L`

, and`optsum`

: fields of the`LMM`

field`X`

: fixed-effects model matrix`y`

: response vector

`MixedModels.Grouping`

— Type`struct Grouping <: StatsModels.AbstractContrasts end`

A placeholder type to indicate that a categorical variable is only used for grouping and not for contrasts. When creating a `CategoricalTerm`

, this skips constructing the contrasts matrix which makes it robust to large numbers of levels, while still holding onto the vector of levels and constructing the level-to-index mapping (`invindex`

field of the `ContrastsMatrix`

.).

Note that calling `modelcols`

on a `CategoricalTerm{Grouping}`

is an error.

**Examples**

```
julia> schema((; grp = string.(1:100_000)))
# out-of-memory error
julia> schema((; grp = string.(1:100_000)), Dict(:grp => Grouping()))
```

`MixedModels.LikelihoodRatioTest`

— Type`LikelihoodRatioTest`

Results of MixedModels.likelihoodratiotest

**Fields**

`formulas`

: Vector of model formulae`models`

: NamedTuple of the`dof`

and`deviance`

of the models`tests`

: NamedTuple of the sequential`dofdiff`

,`deviancediff`

, and resulting`pvalues`

**Properties**

`deviance`

: note that this is actually -2 log likelihood for linear models (i.e. without subtracting the constant for a saturated model)`pvalues`

`MixedModels.LinearMixedModel`

— Type`LinearMixedModel`

Linear mixed-effects model representation

**Fields**

`formula`

: the formula for the model`reterms`

: a`Vector{AbstractReMat{T}}`

of random-effects terms.`Xymat`

: horizontal concatenation of a full-rank fixed-effects model matrix`X`

and response`y`

as an`FeMat{T}`

`feterm`

: the fixed-effects model matrix as an`FeTerm{T}`

`sqrtwts`

: vector of square roots of the case weights. Can be empty.`parmap`

: Vector{NTuple{3,Int}} of (block, row, column) mapping of θ to λ`dims`

: NamedTuple{(:n, :p, :nretrms),NTuple{3,Int}} of dimensions.`p`

is the rank of`X`

, which may be smaller than`size(X, 2)`

.`A`

: a`Vector{AbstractMatrix}`

containing the row-major packed lower triangle of`hcat(Z,X,y)'hcat(Z,X,y)`

`L`

: the blocked lower Cholesky factor of`Λ'AΛ+I`

in the same Vector representation as`A`

`optsum`

: an`OptSummary`

object

**Properties**

`θ`

or`theta`

: the covariance parameter vector used to form λ`β`

or`beta`

: the fixed-effects coefficient vector`λ`

or`lambda`

: a vector of lower triangular matrices repeated on the diagonal blocks of`Λ`

`σ`

or`sigma`

: current value of the standard deviation of the per-observation noise`b`

: random effects on the original scale, as a vector of matrices`u`

: random effects on the orthogonal scale, as a vector of matrices`lowerbd`

: lower bounds on the elements of θ`X`

: the fixed-effects model matrix`y`

: the response vector

`MixedModels.LinearMixedModel`

— Type`LinearMixedModel(y, Xs, form, wts=[], σ=nothing)`

Private constructor for a LinearMixedModel.

To construct a model, you only need the response (`y`

), already assembled model matrices (`Xs`

), schematized formula (`form`

) and weights (`wts`

). Everything else in the structure can be derived from these quantities.

This method is internal and experimental and so may change or disappear in a future release without being considered a breaking change.

`MixedModels.LinearMixedModel`

— Method`LinearMixedModel(y, feterm, reterms, form, wts=[], σ=nothing)`

Private constructor for a `LinearMixedModel`

given already assembled fixed and random effects.

To construct a model, you only need a vector of `FeMat`

s (the fixed-effects model matrix and response), a vector of `AbstractReMat`

(the random-effects model matrices), the formula and the weights. Everything else in the structure can be derived from these quantities.

This method is internal and experimental and so may change or disappear in a future release without being considered a breaking change.

`MixedModels.MixedModel`

— Type`MixedModel`

Abstract type for mixed models. MixedModels.jl implements two subtypes: `LinearMixedModel`

and `GeneralizedLinearMixedModel`

. See the documentation for each for more details.

This type is primarily used for dispatch in `fit`

. Without a distribution and link function specified, a `LinearMixedModel`

will be fit. When a distribution/link function is provided, a `GeneralizedLinearModel`

is fit, unless that distribution is `Normal`

and the link is `IdentityLink`

, in which case the resulting GLMM would be equivalent to a `LinearMixedModel`

anyway and so the simpler, equivalent `LinearMixedModel`

will be fit instead.

`MixedModels.MixedModelBootstrap`

— Type`MixedModelBootstrap{T<:AbstractFloat} <: MixedModelFitCollection{T}`

Object returned by `parametericbootstrap`

with fields

`fits`

: the parameter estimates from the bootstrap replicates as a vector of named tuples.`λ`

:`Vector{LowerTriangular{T,Matrix{T}}}`

containing copies of the λ field from`ReMat`

model terms`inds`

:`Vector{Vector{Int}}`

containing copies of the`inds`

field from`ReMat`

model terms`lowerbd`

:`Vector{T}`

containing the vector of lower bounds (corresponds to the identically named field of`OptSummary`

)`fcnames`

: NamedTuple whose keys are the grouping factor names and whose values are the column names

The schema of `fits`

is, by default,

```
Tables.Schema:
:objective T
:σ T
:β NamedTuple{β_names}{NTuple{p,T}}
:se StaticArrays.SArray{Tuple{p},T,1,p}
:θ StaticArrays.SArray{Tuple{k},T,1,k}
```

where the sizes, `p`

and `k`

, of the `β`

and `θ`

elements are determined by the model.

Characteristics of the bootstrap replicates can be extracted as properties. The `σs`

and `σρs`

properties unravel the `σ`

and `θ`

estimates into estimates of the standard deviations and correlations of the random-effects terms.

`MixedModels.OptSummary`

— Type`OptSummary`

Summary of an `NLopt`

optimization

**Fields**

`initial`

: a copy of the initial parameter values in the optimization`finitial`

: the initial value of the objective`lowerbd`

: lower bounds on the parameter values`ftol_rel`

: as in NLopt`ftol_abs`

: as in NLopt`xtol_rel`

: as in NLopt`xtol_abs`

: as in NLopt`initial_step`

: as in NLopt`maxfeval`

: as in NLopt (`maxeval`

)`maxtime`

: as in NLopt`final`

: a copy of the final parameter values from the optimization`fmin`

: the final value of the objective`feval`

: the number of function evaluations`optimizer`

: the name of the optimizer used, as a`Symbol`

`returnvalue`

: the return value, as a`Symbol`

`nAGQ`

: number of adaptive Gauss-Hermite quadrature points in deviance evaluation for GLMMs`REML`

: use the REML criterion for LMM fits`sigma`

: a priori value for the residual standard deviation for LMM`fitlog`

: A vector of tuples of parameter and objectives values from steps in the optimization

The latter four fields are MixedModels functionality and not related directly to the `NLopt`

package or algorithms.

The internal storage of the parameter values within `fitlog`

may change in the future to use a different subtype of `AbstractVector`

(e.g., `StaticArrays.SVector`

) for each snapshot without being considered a breaking change.

`MixedModels.PCA`

— Type`PCA{T<:AbstractFloat}`

Principal Components Analysis

**Fields**

`covcorr`

covariance or correlation matrix`sv`

singular value decomposition`rnames`

rownames of the original matrix`corr`

is this a correlation matrix?

`MixedModels.RaggedArray`

— Type`RaggedArray{T,I}`

A "ragged" array structure consisting of values and indices

**Fields**

`vals`

: a`Vector{T}`

containing the values`inds`

: a`Vector{I}`

containing the indices

For this application a `RaggedArray`

is used only in its `sum!`

method.

`MixedModels.ReMat`

— Type`ReMat{T,S} <: AbstractMatrix{T}`

A section of a model matrix generated by a random-effects term.

**Fields**

`trm`

: the grouping factor as a`StatsModels.CategoricalTerm`

`refs`

: indices into the levels of the grouping factor as a`Vector{Int32}`

`levels`

: the levels of the grouping factor`cnames`

: the names of the columns of the model matrix generated by the left-hand side of the term`z`

: transpose of the model matrix generated by the left-hand side of the term`wtz`

: a weighted copy of`z`

(`z`

and`wtz`

are the same object for unweighted cases)`λ`

: a`LowerTriangular`

matrix of size`S×S`

`inds`

: a`Vector{Int}`

of linear indices of the potential nonzeros in`λ`

`adjA`

: the adjoint of the matrix as a`SparseMatrixCSC{T}`

`scratch`

: a`Matrix{T}`

`MixedModels.UniformBlockDiagonal`

— Type`UniformBlockDiagonal{T}`

Homogeneous block diagonal matrices. `k`

diagonal blocks each of size `m×m`

`MixedModels.VarCorr`

— Type`VarCorr`

Information from the fitted random-effects variance-covariance matrices.

**Members**

`σρ`

: a`NamedTuple`

of`NamedTuple`

s as returned from`σρs`

`s`

: the estimate of the per-observation dispersion parameter

The main purpose of defining this type is to isolate the logic in the show method.

## Exported Functions

`LinearAlgebra.cond`

— Methodcond(m::MixedModel)

Return a vector of condition numbers of the λ matrices for the random-effects terms

`LinearAlgebra.logdet`

— Method`logdet(m::LinearMixedModel)`

Return the value of `log(det(Λ'Z'ZΛ + I)) + m.optsum.REML * log(det(LX*LX'))`

evaluated in place.

Here LX is the diagonal term corresponding to the fixed-effects in the blocked lower Cholesky factor.

`MixedModels.GHnorm`

— Method`GHnorm(k::Int)`

Return the (unique) GaussHermiteNormalized{k} object.

The function values are stored (memoized) when first evaluated. Subsequent evaluations for the same `k`

have very low overhead.

`MixedModels.coefpvalues`

— Method`coefpvalues(bsamp::MixedModelFitCollection)`

Return a rowtable with columns `(:iter, :coefname, :β, :se, :z, :p)`

`MixedModels.condVar`

— Method`condVar(m::LinearMixedModel)`

Return the conditional variances matrices of the random effects.

The random effects are returned by `ranef`

as a vector of length `k`

, where `k`

is the number of random effects terms. The `i`

th element is a matrix of size `vᵢ × ℓᵢ`

where `vᵢ`

is the size of the vector-valued random effects for each of the `ℓᵢ`

levels of the grouping factor. Technically those values are the modes of the conditional distribution of the random effects given the observed data.

This function returns an array of `k`

three dimensional arrays, where the `i`

th array is of size `vᵢ × vᵢ × ℓᵢ`

. These are the diagonal blocks from the conditional variance-covariance matrix,

`s² Λ(Λ'Z'ZΛ + I)⁻¹Λ'`

`MixedModels.condVartables`

— Method`condVartables(m::LinearMixedModel)`

Return the conditional covariance matrices of the random effects as a `NamedTuple`

of columntables

`MixedModels.fitted!`

— Method`fitted!(v::AbstractArray{T}, m::LinearMixedModel{T})`

Overwrite `v`

with the fitted values from `m`

.

See also `fitted`

.

`MixedModels.fixef`

— Method`fixef(m::MixedModel)`

Return the fixed-effects parameter vector estimate of `m`

.

In the rank-deficient case the truncated parameter vector, of length `rank(m)`

is returned. This is unlike `coef`

which always returns a vector whose length matches the number of columns in `X`

.

`MixedModels.fixefnames`

— Method`fixefnames(m::MixedModel)`

Return a (permuted and truncated in the rank-deficient case) vector of coefficient names.

`MixedModels.fnames`

— Method`fnames(m::MixedModel)`

Return the names of the grouping factors for the random-effects terms.

`MixedModels.fulldummy`

— Method`fulldummy(term::CategoricalTerm)`

Assign "contrasts" that include all indicator columns (dummy variables) and an intercept column.

This will result in an under-determined set of contrasts, which is not a problem in the random effects because of the regularization, or "shrinkage", of the conditional modes.

The interaction of `fulldummy`

with complex random effects is subtle and complex with numerous potential edge cases. As we discover these edge cases, we will document and determine their behavior. Until such time, please check the model summary to verify that the expansion is working as you expected. If it is not, please report a use case on GitHub.

`MixedModels.issingular`

— Function`issingular(m::MixedModel, θ=m.θ)`

Test whether the model `m`

is singular if the parameter vector is `θ`

.

Equality comparisons are used b/c small non-negative θ values are replaced by 0 in `fit!`

.

For `GeneralizedLinearMixedModel`

, the entire parameter vector (including β in the case `fast=false`

) must be specified if the default is not used.

`MixedModels.issingular`

— Method`issingular(bsamp::MixedModelFitCollection)`

Test each bootstrap sample for singularity of the corresponding fit.

Equality comparisons are used b/c small non-negative θ values are replaced by 0 in `fit!`

.

See also `issingular(::MixedModel)`

.

`MixedModels.lowerbd`

— Method`lowerbd{T}(A::ReMat{T})`

Return the vector of lower bounds on the parameters, `θ`

associated with `A`

These are the elements in the lower triangle of `A.λ`

in column-major ordering. Diagonals have a lower bound of `0`

. Off-diagonals have a lower-bound of `-Inf`

.

`MixedModels.objective`

— Method`objective(m::LinearMixedModel)`

Return negative twice the log-likelihood of model `m`

`MixedModels.parametricbootstrap`

— Method```
parametricbootstrap([rng::AbstractRNG], nsamp::Integer, m::MixedModel{T}, ftype=T;
β = coef(m), σ = m.σ, θ = m.θ, use_threads=false, hide_progress=false)
```

Perform `nsamp`

parametric bootstrap replication fits of `m`

, returning a `MixedModelBootstrap`

.

The default random number generator is `Random.GLOBAL_RNG`

.

`ftype`

can be used to store the computed bootstrap values in a lower precision. `ftype`

is not a named argument because named arguments are not used in method dispatch and thus specialization. In other words, having `ftype`

as a positional argument has some potential performance benefits.

**Named Arguments**

`β`

,`σ`

, and`θ`

are the values of`m`

's parameters for simulating the responses.`σ`

is only valid for`LinearMixedModel`

and`GeneralizedLinearMixedModel`

for

families with a dispersion parameter.

`use_threads`

determines whether or not to use thread-based parallelism.`hide_progress`

can be used to disable the progress bar. Note that the progress

bar is automatically disabled for non-interactive (i.e. logging) contexts.

Note that `use_threads=true`

may not offer a performance boost and may even decrease peformance if multithreaded linear algebra (BLAS) routines are available. In this case, threads at the level of the linear algebra may already occupy all processors/processor cores. There are plans to provide better support in coordinating Julia- and BLAS-level threads in the future.

The PRNG shared between threads is locked using `Threads.SpinLock`

, which should not be used recursively. Do not wrap `parametricbootstrap`

in an outer `SpinLock`

.

`MixedModels.pirls!`

— Method`pirls!(m::GeneralizedLinearMixedModel)`

Use Penalized Iteratively Reweighted Least Squares (PIRLS) to determine the conditional modes of the random effects.

When `varyβ`

is true both `u`

and `β`

are optimized with PIRLS. Otherwise only `u`

is optimized and `β`

is held fixed.

Passing `verbose = true`

provides verbose output of the iterations.

`MixedModels.pwrss`

— Method`pwrss(m::LinearMixedModel)`

The penalized, weighted residual sum-of-squares.

`MixedModels.ranef`

— Method`ranef(m::LinearMixedModel; uscale=false)`

Return, as a `Vector{Matrix{T}}`

, the conditional modes of the random effects in model `m`

.

If `uscale`

is `true`

the random effects are on the spherical (i.e. `u`

) scale, otherwise on the original scale.

For a named variant, see `raneftables`

.

`MixedModels.raneftables`

— Method`raneftables(m::MixedModel; uscale = false)`

Return the conditional means of the random effects as a `NamedTuple`

of columntables

`MixedModels.refit!`

— Method```
refit!(m::GeneralizedLinearMixedModel[, y::Vector];
fast::Bool = (length(m.θ) == length(m.optsum.final)),
nAGQ::Integer = m.optsum.nAGQ,
kwargs...)
```

Refit the model `m`

after installing response `y`

.

If `y`

is omitted the current response vector is used.

If not specified, the `fast`

and `nAGQ`

options from the previous fit are used. `kwargs`

are the same as `fit!`

`MixedModels.refit!`

— Method`refit!(m::LinearMixedModel[, y::Vector]; REML=m.optsum.REML, kwargs...)`

Refit the model `m`

after installing response `y`

.

If `y`

is omitted the current response vector is used. `kwargs`

are the same as `fit!`

.

`MixedModels.replicate`

— Method`replicate(f::Function, n::Integer; use_threads=false)`

Return a vector of the values of `n`

calls to `f()`

- used in simulations where the value of `f`

is stochastic.

`hide_progress`

can be used to disable the progress bar. Note that the progress bar is automatically disabled for non-interactive (i.e. logging) contexts.

If `f()`

is not thread-safe or depends on a non thread-safe RNG, then you must set `use_threads=false`

. Also note that ordering of replications is not guaranteed when `use_threads=true`

, although the replications are not otherwise affected for thread-safe `f()`

.

`MixedModels.restoreoptsum!`

— Method```
restoreoptsum!(m::LinearMixedModel, io::IO)
restoreoptsum!(m::LinearMixedModel, filename)
```

Read, check, and restore the `optsum`

field from a JSON stream or filename.

`MixedModels.saveoptsum`

— Method```
saveoptsum(io::IO, m::LinearMixedModel)
saveoptsum(filename, m::LinearMixedModel)
```

Save `m.optsum`

(w/o the `lowerbd`

field) in JSON format to an IO stream or a file

The reason for omitting the `lowerbd`

field is because it often contains `-Inf`

values that are not allowed in JSON.

`MixedModels.sdest`

— Method`sdest(m::LinearMixedModel)`

Return the estimate of σ, the standard deviation of the per-observation noise.

`MixedModels.sdest`

— Method`sdest(m::GeneralizedLinearMixedModel)`

Return the estimate of the dispersion, i.e. the standard deviation of the per-observation noise.

For models with a dispersion parameter ϕ, this is simply ϕ. For models without a dispersion parameter, this value is `missing`

. This differs from `disperion`

, which returns `1`

for models without a dispersion parameter.

For Gaussian models, this parameter is often called σ.

`MixedModels.setθ!`

— Method`setθ!(bsamp::MixedModelFitCollection, i::Integer)`

Install the values of the i'th θ value of `bsamp.fits`

in `bsamp.λ`

`MixedModels.setθ!`

— Method`setθ!(m::LinearMixedModel, v)`

Install `v`

as the θ parameters in `m`

.

`MixedModels.shortestcovint`

— Function`shortestcovint(v, level = 0.95)`

Return the shortest interval containing `level`

proportion of the values of `v`

`MixedModels.shortestcovint`

— Method`shortestcovint(bsamp::MixedModelFitCollection, level = 0.95)`

Return the shortest interval containing `level`

proportion for each parameter from `bsamp.allpars`

.

Currently, correlations that are systematically zero are included in the the result. This may change in a future release without being considered a breaking change.

`MixedModels.simulate`

— FunctionSee `simulate!`

`MixedModels.simulate!`

— Method```
simulate!([rng::AbstractRNG,] y::AbstractVector, m::MixedModel{T}[, newdata];
β = coef(m), σ = m.σ, θ = T[], wts=m.wts)
simulate([rng::AbstractRNG,] m::MixedModel{T}[, newdata];
β = coef(m), σ = m.σ, θ = T[], wts=m.wts)
```

Simulate a new response vector, optionally overwriting a pre-allocated vector.

New data can be optionally provided in tabular format.

This simulation includes sampling new values for the random effects. Thus in contrast to `predict`

, there is no distinction in between "new" and "old" / previously observed random-effects levels.

Unlike `predict`

, there is no `type`

parameter for `GeneralizedLinearMixedModel`

because the noise term in the model and simulation is always on the response scale.

The `wts`

argument is currently ignored except for `GeneralizedLinearMixedModel`

models with a `Binomial`

distribution.

Models are assumed to be full rank.

Note that `simulate!`

methods with a `y::AbstractVector`

as the first argument (besides the RNG) and `simulate`

methods return the simulated response. This is in contrast to `simulate!`

methods with a `m::MixedModel`

as the first argument, which modify the model's response and return the entire modified model.

`MixedModels.simulate!`

— Method```
simulate!(rng::AbstractRNG, m::MixedModel{T}; β=m.β, σ=m.σ, θ=T[])
simulate!(m::MixedModel; β=m.β, σ=m.σ, θ=m.θ)
```

Overwrite the response (i.e. `m.trms[end]`

) with a simulated response vector from model `m`

.

This simulation includes sampling new values for the random effects.

Note that `simulate!`

methods with a `y::AbstractVector`

as the first argument (besides the RNG) and `simulate`

methods return the simulated response. This is in contrast to `simulate!`

methods with a `m::MixedModel`

as the first argument, which modify the model's response and return the entire modified model.

`MixedModels.sparseL`

— Method`sparseL(m::LinearMixedModel; fname::Symbol=first(fnames(m)), full::Bool=false)`

Return the lower Cholesky factor `L`

as a `SparseMatrix{T,Int32}`

.

`full`

indicates whether the parts of `L`

associated with the fixed-effects and response are to be included.

`fname`

specifies the first grouping factor to include. Blocks to the left of the block corresponding to `fname`

are dropped. The default is the first, i.e., leftmost block and hence all blocks.

`MixedModels.stderror!`

— Method`stderror!(v::AbstractVector, m::LinearMixedModel)`

Overwrite `v`

with the standard errors of the fixed-effects coefficients in `m`

The length of `v`

should be the total number of coefficients (i.e. `length(coef(m))`

). When the model matrix is rank-deficient the coefficients forced to `-0.0`

have an undefined (i.e. `NaN`

) standard error.

`MixedModels.updateL!`

— Method`updateL!(m::LinearMixedModel)`

Update the blocked lower Cholesky factor, `m.L`

, from `m.A`

and `m.reterms`

(used for λ only)

This is the crucial step in evaluating the objective, given a new parameter value.

`MixedModels.varest`

— Method`varest(m::LinearMixedModel)`

Returns the estimate of σ², the variance of the conditional distribution of Y given B.

`MixedModels.varest`

— Method`varest(m::GeneralizedLinearMixedModel)`

Returns the estimate of ϕ², the variance of the conditional distribution of Y given B.

For models with a dispersion parameter ϕ, this is simply ϕ². For models without a dispersion parameter, this value is `missing`

. This differs from `disperion`

, which returns `1`

for models without a dispersion parameter.

For Gaussian models, this parameter is often called σ².

`MixedModels.zerocorr`

— Method`zerocorr(term::RandomEffectsTerm)`

Remove correlations between random effects in `term`

.

`Statistics.std`

— Method`std(m::MixedModel)`

Return the estimated standard deviations of the random effects as a `Vector{Vector{T}}`

.

FIXME: This uses an old convention of isfinite(sdest(m)). Probably drop in favor of m.σs

`StatsAPI.deviance`

— Method`deviance(m::GeneralizedLinearMixedModel{T}, nAGQ=1)::T where {T}`

Return the deviance of `m`

evaluated by the Laplace approximation (`nAGQ=1`

) or `nAGQ`

-point adaptive Gauss-Hermite quadrature.

If the distribution `D`

does not have a scale parameter the Laplace approximation is the squared length of the conditional modes, $u$, plus the determinant of $Λ'Z'WZΛ + I$, plus the sum of the squared deviance residuals.

`StatsAPI.dof_residual`

— Method`dof_residual(m::MixedModel)`

Return the residual degrees of freedom of the model.

The residual degrees of freedom for mixed-effects models is not clearly defined due to partial pooling. The classical `nobs(m) - dof(m)`

fails to capture the extra freedom granted by the random effects, but `nobs(m) - nranef(m)`

would overestimate the freedom granted by the random effects. `nobs(m) - sum(leverage(m))`

provides a nice balance based on the relative influence of each observation, but is computationally expensive for large models. This problem is also fundamentally related to long-standing debates about the appropriate treatment of the denominator degrees of freedom for $F$-tests. In the future, MixedModels.jl may provide additional methods allowing the user to choose the computation to use.

Currently, the residual degrees of freedom is computed as `nobs(m) - dof(m)`

, but this may change in the future without being considered a breaking change because there is no canonical definition of the residual degrees of freedom in a mixed-effects model.

`StatsAPI.fit!`

— Method```
fit!(m::GeneralizedLinearMixedModel; fast=false, nAGQ=1,
verbose=false, progress=true,
thin::Int=1,
init_from_lmm=Set())
```

Optimize the objective function for `m`

.

When `fast`

is `true`

a potentially much faster but slightly less accurate algorithm, in which `pirls!`

optimizes both the random effects and the fixed-effects parameters, is used.

If `progress`

is `true`

, the default, a `ProgressMeter.ProgressUnknown`

counter is displayed. during the iterations to minimize the deviance. There is a delay before this display is initialized and it may not be shown at all for models that are optimized quickly.

If `verbose`

is `true`

, then both the intermediate results of both the nonlinear optimization and PIRLS are also displayed on standard output.

At every `thin`

th iteration is recorded in `fitlog`

, optimization progress is saved in `m.optsum.fitlog`

.

By default, the starting values for model fitting are taken from a (non mixed, i.e. marginal ) GLM fit. Experience with larger datasets (many thousands of observations and/or hundreds of levels of the grouping variables) has suggested that fitting a (Gaussian) linear mixed model on the untransformed data may provide better starting values and thus overall faster fits even though an entire LMM must be fit before the GLMM can be fit. `init_from_lmm`

can be used to specify which starting values from an LMM to use. Valid options are any collection (array, set, etc.) containing one or more of `:β`

and `:θ`

, the default is the empty set.

Initializing from an LMM requires fitting the entire LMM first, so when `progress=true`

, there will be two progress bars: first for the LMM, then for the GLMM.

The `init_from_lmm`

functionality is experimental and may change or be removed entirely without being considered a breaking change.

`StatsAPI.fit!`

— Method```
fit!(m::LinearMixedModel; progress::Bool=true, REML::Bool=false,
σ::Union{Real, Nothing}=nothing,
thin::Int=typemax(Int))
```

Optimize the objective of a `LinearMixedModel`

. When `progress`

is `true`

a `ProgressMeter.ProgressUnknown`

display is shown during the optimization of the objective, if the optimization takes more than one second or so.

At every `thin`

th iteration is recorded in `fitlog`

, optimization progress is saved in `m.optsum.fitlog`

.

`StatsAPI.leverage`

— Method`leverage(::LinearMixedModel)`

Return the diagonal of the hat matrix of the model.

For a linear model, the sum of the leverage values is the degrees of freedom for the model in the sense that this sum is the dimension of the span of columns of the model matrix. With a bit of hand waving a similar argument could be made for linear mixed-effects models. The hat matrix is of the form $[ZΛ X][L L']⁻¹[ZΛ X]'$.

`StatsAPI.modelmatrix`

— Method`modelmatrix(m::MixedModel)`

Returns the model matrix `X`

for the fixed-effects parameters, as returned by `coef`

.

This is always the full model matrix in the original column order and from a field in the model struct. It should be copied if it is to be modified.

`StatsAPI.predict`

— Method```
StatsAPI.predict(m::LinearMixedModel, newdata;
new_re_levels=:missing)
StatsAPI.predict(m::GeneralizedLinearMixedModel, newdata;
new_re_levels=:missing, type=:response)
```

Predict response for new data.

Currently, no in-place methods are provided because these methods internally construct a new model and therefore allocate not just a response vector but also many other matrices.

`newdata`

should contain a column for the response (dependent variable) initialized to some numerical value (not `missing`

), because this is used to construct the new model used in computing the predictions. `missing`

is not valid because `missing`

data are dropped before constructing the model matrices.

Models are assumed to be full rank.

These methods construct an entire MixedModel behind the scenes and as such may use a large amount of memory when `newdata`

is large.

The keyword argument `new_re_levels`

specifies how previously unobserved values of the grouping variable are handled. Possible values are:

`:population`

: return population values for the relevant grouping variable. In other words, treat the associated random effect as 0. If all grouping variables have new levels, then this is equivalent to just the fixed effects.`:missing`

: return`missing`

.`:error`

: error on this condition. The error type is an implementation detail: you should not rely on a particular type of error being thrown.

If you want simulated values for unobserved levels of the grouping variable, consider the `simulate!`

and `simulate`

methods.

Predictions based purely on the fixed effects can be obtained by specifying previously unobserved levels of the random effects and setting `new_re_levels=:population`

. Similarly, the contribution of any grouping variable can be excluded by specifying previously unobserved levels, while including previously observed levels of the other grouping variables. In the future, it may be possible to specify a subset of the grouping variables or overall random-effects structure to use, but not at this time.

`new_re_levels`

impacts only the behavior for previously unobserved random effects levels, i.e. new RE levels. For previously observed random effects levels, predictions take both the fixed and random effects into account.

For `GeneralizedLinearMixedModel`

, the `type`

parameter specifies whether the predictions should be returned on the scale of linear predictor (`:linpred`

) or on the response scale (`:response`

). If you don't know the difference between these terms, then you probably want `type=:response`

.

Regression weights are not yet supported in prediction. Similarly, offsets are also not supported for `GeneralizedLinearMixedModel`

.

`StatsAPI.response`

— Method`response(m::MixedModel)`

Return the response vector for the model.

For a linear mixed model this is a `view`

of the last column of the `XyMat`

field. For a generalized linear mixed model this is the `m.resp.y`

field. In either case it should be copied if it is to be modified.

`StatsAPI.vcov`

— Method`vcov(m::MixedModel; corr=false)`

Returns the variance-covariance matrix of the fixed effects. If `corr`

is `true`

, the correlation of the fixed effects is returned instead.

`Tables.columntable`

— Method`columntable(s::OptSummary, [stack::Bool=false])`

Return `s.fitlog`

as a `Tables.columntable`

.

When `stack`

is false (the default), there will be 3 columns in the result:

`iter`

: the sample number`objective`

: the value of the objective at that sample`θ`

: the parameter vector at that sample

(The term `sample`

here refers to the fact that when the `thin`

argument to the `fit`

or `refit!`

call is greater than 1 only a subset of the iterations have results recorded.)

When `stack`

is true, there will be 4 columns: `iter`

, `objective`

, `par`

, and `value`

where `value`

is the stacked contents of the `θ`

vectors (the equivalent of `vcat(θ...)`

) and `par`

is a vector of parameter numbers.

## Methods from `StatsAPI.jl`

, `StatsBase.jl`

, `StatsModels.jl`

and `GLM.jl`

```
aic
aicc
bic
coef
coefnames
coeftable
deviance
dispersion
dispersion_parameter
dof
dof_residual
fit
fit!
fitted
formula
isfitted
islinear
leverage
loglikelihood
meanresponse
modelmatrix
model_response
nobs
predict
residuals
response
responsename
StatsModels.lrtest # not exported
std
stderror
vcov
weights
```

### MixedModels.jl "alternatives" and extensions to StatsAPI and GLM functions

The following are MixedModels.jl-specific functions and not simply methods for functions defined in StatsAPI and GLM.jl.

```
coefpvalues
condVar
condVarTables
fitted!
fixef
fixefnames
likelihoodratiotest # not exported
pwrss
ranef
raneftables
refit!
shortestcovint
sdest
simulate
simulate!
stderrror!
varest
```

## Non-Exported Functions

Note that unless discussed elsewhere in the online documentation, non-exported functions should be considered implementation details.

`Base.size`

— Method`size(m::MixedModel)`

Returns the size of a mixed model as a tuple of length four: the number of observations, the number of (non-singular) fixed-effects parameters, the number of conditional modes (random effects), the number of grouping variables

`GLM.wrkresp!`

— Method`GLM.wrkresp!(v::AbstractVector{T}, resp::GLM.GlmResp{AbstractVector{T}})`

A copy of a method from GLM that generalizes the types in the signature

`MixedModels.LD`

— Method```
LD(A::Diagonal)
LD(A::HBlikDiag)
LD(A::DenseMatrix)
```

Return `log(det(tril(A)))`

evaluated in place.

`MixedModels.adjA`

— Method`adjA(refs::AbstractVector, z::AbstractMatrix{T})`

Returns the adjoint of an `ReMat`

as a `SparseMatrixCSC{T,Int32}`

`MixedModels.allpars`

— Method`allpars(bsamp::MixedModelFitCollection)`

Return a tidy (column)table with the parameter estimates spread into columns of `iter`

, `type`

, `group`

, `name`

and `value`

.

Currently, correlations that are systematically zero are included in the the result. This may change in a future release without being considered a breaking change.

`MixedModels.amalgamate`

— Method`amalgamate(reterms::Vector{AbstractReMat})`

Combine multiple ReMat with the same grouping variable into a single object.

`MixedModels.average`

— Method`average(a::T, b::T) where {T<:AbstractFloat}`

Return the average of `a`

and `b`

`MixedModels.block`

— Method`block(i, j)`

Return the linear index of the `[i,j]`

position ("block") in the row-major packed lower triangle.

Use the row-major ordering in this case because the result depends only on `i`

and `j`

, not on the overall size of the array.

When `i == j`

the value is the same as `kp1choose2(i)`

.

`MixedModels.cholUnblocked!`

— Function`cholUnblocked!(A, Val{:L})`

Overwrite the lower triangle of `A`

with its lower Cholesky factor.

The name is borrowed from [https://github.com/andreasnoack/LinearAlgebra.jl] because these are part of the inner calculations in a blocked Cholesky factorization.

`MixedModels.copyscaleinflate!`

— Function`copyscaleinflate!(L::AbstractMatrix, A::AbstractMatrix, Λ::ReMat)`

Overwrite L with `Λ'AΛ + I`

`MixedModels.corrmat`

— Method`corrmat(A::ReMat)`

Return the estimated correlation matrix for `A`

. The diagonal elements are 1 and the off-diagonal elements are the correlations between those random effect terms

**Example**

Note that trailing digits may vary slightly depending on the local platform.

```
julia> using MixedModels
julia> mod = fit(MixedModel,
@formula(rt_trunc ~ 1 + spkr + prec + load + (1 + spkr + prec | subj)),
MixedModels.dataset(:kb07));
julia> VarCorr(mod)
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 136591.782 369.583
spkr: old 22922.871 151.403 +0.21
prec: maintain 32348.269 179.856 -0.98 -0.03
Residual 642324.531 801.452
julia> MixedModels.corrmat(mod.reterms[1])
3×3 LinearAlgebra.Symmetric{Float64,Array{Float64,2}}:
1.0 0.214816 -0.982948
0.214816 1.0 -0.0315607
-0.982948 -0.0315607 1.0
```

`MixedModels.cpad`

— Method`cpad(s::AbstractString, n::Integer)`

Return a string of length `n`

containing `s`

in the center (more-or-less).

`MixedModels.dataset`

— Method`dataset(nm)`

Return, as an `Arrow.Table`

, the test data set named `nm`

, which can be a `String`

or `Symbol`

`MixedModels.datasets`

— Method`datasets()`

Return a vector of names of the available test data sets

`MixedModels.densify`

— Functiondensify(S::SparseMatrix, threshold=0.1)

Convert sparse `S`

to `Diagonal`

if `S`

is diagonal or to `Array(S)`

if the proportion of nonzeros exceeds `threshold`

.

`MixedModels.deviance!`

— Function`deviance!(m::GeneralizedLinearMixedModel, nAGQ=1)`

Update `m.η`

, `m.μ`

, etc., install the working response and working weights in `m.LMM`

, update `m.LMM.A`

and `m.LMM.R`

, then evaluate the `deviance`

.

`MixedModels.feL`

— Method`feL(m::LinearMixedModel)`

Return the lower Cholesky factor for the fixed-effects parameters, as an `LowerTriangular`

`p × p`

matrix.

`MixedModels.fixef!`

— Method`fixef!(v::Vector{T}, m::MixedModel{T})`

Overwrite `v`

with the pivoted fixed-effects coefficients of model `m`

For full-rank models the length of `v`

must be the rank of `X`

. For rank-deficient models the length of `v`

can be the rank of `X`

or the number of columns of `X`

. In the latter case the calculated coefficients are padded with -0.0 out to the number of columns.

`MixedModels.fname`

— Method`fname(A::ReMat)`

Return the name of the grouping factor as a `Symbol`

`MixedModels.getθ!`

— Method`getθ!(v::AbstractVector{T}, A::ReMat{T}) where {T}`

Overwrite `v`

with the elements of the blocks in the lower triangle of `A.Λ`

(column-major ordering)

`MixedModels.getθ`

— Method`getθ(m::LinearMixedModel)`

Return the current covariance parameter vector.

`MixedModels.indmat`

— Function`indmat(A::ReMat)`

Return a `Bool`

indicator matrix of the potential non-zeros in `A.λ`

`MixedModels.isconstant`

— Method```
isconstant(x::Array)
isconstant(x::Tuple)
```

Are all elements of the iterator the same? That is, is it constant?

`MixedModels.isfullrank`

— Method`isfullrank(A::FeTerm)`

Does `A`

have full column rank?

`MixedModels.isnested`

— Method`isnested(A::ReMat, B::ReMat)`

Is the grouping factor for `A`

nested in the grouping factor for `B`

?

That is, does each value of `A`

occur with just one value of B?

`MixedModels.kchoose2`

— Method`kchoose2(k)`

The binomial coefficient `k`

choose `2`

which is the number of elements in the packed form of the strict lower triangle of a matrix.

`MixedModels.kp1choose2`

— Method`kp1choose2(k)`

The binomial coefficient `k+1`

choose `2`

which is the number of elements in the packed form of the lower triangle of a matrix.

`MixedModels.likelihoodratiotest`

— Method```
likelihoodratiotest(m::MixedModel...)
likelihoodratiotest(m0::LinearModel, m::MixedModel...)
likelihoodratiotest(m0::GeneralizedLinearModel, m::MixedModel...)
likelihoodratiotest(m0::TableRegressionModel{LinearModel}, m::MixedModel...)
likelihoodratiotest(m0::TableRegressionModel{GeneralizedLinearModel}, m::MixedModel...)
```

Likeihood ratio test applied to a set of nested models.

The nesting of the models is not checked. It is incumbent on the user to check this. This differs from `StatsModels.lrtest`

as nesting in mixed models, especially in the random effects specification, may be non obvious.

For comparisons between mixed and non-mixed models, the deviance for the non-mixed model is taken to be -2 log likelihood, i.e. omitting the additive constant for the fully saturated model. This is in line with the computation of the deviance for mixed models.

This functionality may be deprecated in the future in favor of `StatsModels.lrtest`

.

`MixedModels.nranef`

— Method`nranef(A::ReMat)`

Return the number of random effects represented by `A`

. Zero unless `A`

is an `ReMat`

.

`MixedModels.nθ`

— Method`nθ(A::ReMat)`

Return the number of free parameters in the relative covariance matrix λ

`MixedModels.ranef!`

— Method`ranef!(v::Vector{Matrix{T}}, m::MixedModel{T}, β, uscale::Bool) where {T}`

Overwrite `v`

with the conditional modes of the random effects for `m`

.

If `uscale`

is `true`

the random effects are on the spherical (i.e. `u`

) scale, otherwise on the original scale

`MixedModels.rankUpdate!`

— Function```
rankUpdate!(C, A)
rankUpdate!(C, A, α)
rankUpdate!(C, A, α, β)
```

A rank-k update, C := α*A'A + β*C, of a Hermitian (Symmetric) matrix.

`α`

and `β`

both default to 1.0. When `α`

is -1.0 this is a downdate operation. The name `rankUpdate!`

is borrowed from [https://github.com/andreasnoack/LinearAlgebra.jl]

`MixedModels.rePCA`

— Method`rePCA(m::LinearMixedModel; corr::Bool=true)`

Return a named tuple of the normalized cumulative variance of a principal components analysis of the random effects covariance matrices or correlation matrices when `corr`

is `true`

.

The normalized cumulative variance is the proportion of the variance for the first principal component, the first two principal components, etc. The last element is always 1.0 representing the complete proportion of the variance.

`MixedModels.reevaluateAend!`

— Method`reevaluateAend!(m::LinearMixedModel)`

Reevaluate the last column of `m.A`

from `m.Xymat`

. This function should be called after updating the response.

`MixedModels.sdcorr`

— Method`sdcorr(A::AbstractMatrix{T}) where {T}`

Transform a square matrix `A`

with positive diagonals into an `NTuple{size(A,1), T}`

of standard deviations and a tuple of correlations.

`A`

is assumed to be symmetric and only the lower triangle is used. The order of the correlations is row-major ordering of the lower triangle (or, equivalently, column-major in the upper triangle).

`MixedModels.setβθ!`

— Method`setβθ!(m::GeneralizedLinearMixedModel, v)`

Set the parameter vector, `:βθ`

, of `m`

to `v`

.

`βθ`

is the concatenation of the fixed-effects, `β`

, and the covariance parameter, `θ`

.

`MixedModels.ssqdenom`

— Method`ssqdenom(m::LinearMixedModel)`

Return the denominator for penalized sums-of-squares.

For MLE, this value is the number of observations. For REML, this value is the number of observations minus the rank of the fixed-effects matrix. The difference is analagous to the use of n or n-1 in the denominator when calculating the variance.

`MixedModels.statsrank`

— Method`statsrank(x::Matrix{T}, ranktol::Real=1e-8) where {T<:AbstractFloat}`

Return the numerical column rank and a pivot vector.

The rank is determined from the absolute values of the diagonal of R from a pivoted QR decomposition, relative to the first (and, hence, largest) element of this vector.

In the full-rank case the pivot vector is `collect(axes(x, 2))`

.

`MixedModels.tidyβ`

— Method`tidyβ(bsamp::MixedModelFitCollection)`

Return a tidy (row)table with the parameter estimates spread into columns of `iter`

, `coefname`

and `β`

`MixedModels.tidyσs`

— Method`tidyσs(bsamp::MixedModelFitCollection)`

Return a tidy (row)table with the estimates of the variance components (on the standard deviation scale) spread into columns of `iter`

, `group`

, `column`

and `σ`

.

`MixedModels.unfit!`

— Method`unfit!(model::MixedModel)`

Mark a model as unfitted.

`MixedModels.unscaledre!`

— Function```
unscaledre!(y::AbstractVector{T}, M::ReMat{T}) where {T}
unscaledre!(rng::AbstractRNG, y::AbstractVector{T}, M::ReMat{T}) where {T}
```

Add unscaled random effects simulated from `M`

to `y`

.

These are unscaled random effects (i.e. they incorporate λ but not σ) because the scaling is done after the per-observation noise is added as a standard normal.

`MixedModels.updateA!`

— Method`updateA!(m::LinearMixedModel)`

Update the cross-product array, `m.A`

, from `m.reterms`

and `m.Xymat`

This is usually done after a reweight! operation.

`MixedModels.updateη!`

— Method`updateη!(m::GeneralizedLinearMixedModel)`

Update the linear predictor, `m.η`

, from the offset and the `B`

-scale random effects.

`StatsModels.isnested`

— Method`isnested(m1::MixedModel, m2::MixedModel; atol::Real=0.0)`

Indicate whether model `m1`

is nested in model `m2`

, i.e. whether `m1`

can be obtained by constraining some parameters in `m2`

. Both models must have been fitted on the same data. This check is conservative for `MixedModel`

s and may reject nested models with different parameterizations as being non nested.