Gradient and Hessian computation
Experimental support for computing the gradient and the Hessian of the objective function (i.e., negative twice the profiled log-likelihood) via ForwardDiff.jl and FiniteDiff.jl is provided as package extensions.
via ForwardDiff.jl
The core functionality is provided by defining appropriate methods for ForwardDiff.gradient and ForwardDiff.hessian:
ForwardDiff.gradient — Method
ForwardDiff.gradient(model::LinearMixedModel)
Evaluate the gradient of the objective function at the currently fitted parameter values.
Most of MixedModels.jl relies strongly on in-place methods in order to minimize the amount of memory allocated. In addition to reducing the memory burden (especially for large models), this practice generally speeds up evaluation of the objective. In-place methods, however, generally do not play well with automatic differentiation. For the automatic differentiation support provided here, the developers instead implemented alternative, out-of-place methods. These will generally be slower and much more memory intensive, so use of this functionality is not recommended for large models.
Compatibility with ForwardDiff.jl is experimental. The precise structure, including function names and method definitions, is subject to change without being considered a breaking change. In particular, the exact set of parameters included is subject to change. The θ parameter is always included, but whether σ and/or the fixed effects should be included is currently still being decided.
ForwardDiff.hessian — Method
ForwardDiff.hessian(model::LinearMixedModel)
Evaluate the Hessian of the objective function at the currently fitted parameter values.
Most of MixedModels.jl relies strongly on in-place methods in order to minimize the amount of memory allocated. In addition to reducing the memory burden (especially for large models), this practice generally speeds up evaluation of the objective. In-place methods, however, generally do not play well with automatic differentiation. For the automatic differentiation support provided here, the developers instead implemented alternative, out-of-place methods. These will generally be slower and much more memory intensive, so use of this functionality is not recommended for large models.
Compatibility with ForwardDiff.jl is experimental. The precise structure, including function names and method definitions, is subject to change without being considered a breaking change. In particular, the exact set of parameters included is subject to change. The θ parameter is always included, but whether σ and/or the fixed effects should be included is currently still being decided.
Exact zero at optimum for trivial models
using MixedModels, MixedModelsDatasets, ForwardDiff
fm1 = lmm(@formula(yield ~ 1 + (1|batch)), MixedModelsDatasets.dataset(:dyestuff2))
Linear mixed model fit by maximum likelihood
yield ~ 1 + (1 | batch)
logLik -2 logLik AIC AICc BIC
-81.4365 162.8730 168.8730 169.7961 173.0766
Variance components:
Column Variance Std.Dev.
batch (Intercept) 0.00000 0.00000
Residual 13.34610 3.65323
Number of obs: 30; levels of grouping factors: 6
Fixed-effects parameters:
───────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
───────────────────────────────────────────────
(Intercept) 5.6656 0.666986 8.49 <1e-16
───────────────────────────────────────────────
ForwardDiff.gradient(fm1)
1-element Vector{Float64}:
0.0
ForwardDiff.hessian(fm1)
1×1 Matrix{Float64}:
28.76868076413998
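Because the fitted variance for batch in fm1 is exactly zero, the gradient is exactly zero at the optimum and the 1×1 Hessian reduces to the (positive) curvature in the single θ component. A quick sanity check on that curvature (a sketch; isposdef and the LinearAlgebra module are from the Julia standard library):

```julia
using LinearAlgebra

H = ForwardDiff.hessian(fm1)
isposdef(H)   # true: the objective has positive curvature at the optimum
```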
Approximate zero at optimum for non-trivial models
fm2 = lmm(@formula(reaction ~ 1 + days + (1+days|subj)), MixedModelsDatasets.dataset(:sleepstudy))
Linear mixed model fit by maximum likelihood
reaction ~ 1 + days + (1 + days | subj)
logLik -2 logLik AIC AICc BIC
-875.9697 1751.9393 1763.9393 1764.4249 1783.0971
Variance components:
Column Variance Std.Dev. Corr.
subj (Intercept) 565.52071 23.78068
days 32.68242 5.71685 +0.08
Residual 654.94015 25.59180
Number of obs: 180; levels of grouping factors: 18
Fixed-effects parameters:
──────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
──────────────────────────────────────────────────
(Intercept) 251.405 6.6323 37.91 <1e-99
days 10.4673 1.50224 6.97 <1e-11
──────────────────────────────────────────────────
ForwardDiff.gradient(fm2)
3-element Vector{Float64}:
0.00014807478304845745
-0.00027151479254072797
0.0005646589082459741
ForwardDiff.hessian(fm2)
3×3 Matrix{Float64}:
45.4118 35.9372 6.356
35.9372 465.736 203.994
6.356 203.994 963.95
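The gradient entries above are small but not exactly zero: the optimizer terminates when the improvement in the objective, not the gradient, falls below a tolerance. One way to summarize how close fm2 is to a stationary point (a sketch using the standard-library norm):

```julia
using LinearAlgebra

g = ForwardDiff.gradient(fm2)
norm(g)   # on the order of 1e-4 for the fit shown above
```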
via FiniteDiff.jl
The core functionality is provided by defining appropriate methods for FiniteDiff.finite_difference_gradient and FiniteDiff.finite_difference_hessian:
FiniteDiff.finite_difference_gradient — Method
FiniteDiff.finite_difference_gradient(model::LinearMixedModel, args...; kwargs...)
Evaluate the gradient of the objective function at the currently fitted parameter values.
Compatibility with FiniteDiff.jl is experimental. The precise structure, including function names and method definitions, is subject to change without being considered a breaking change. In particular, the exact set of parameters included is subject to change. The θ parameter is always included, but whether σ and/or the fixed effects should be included is currently still being decided.
FiniteDiff.finite_difference_hessian — Method
FiniteDiff.finite_difference_hessian(model::LinearMixedModel, args...; kwargs...)
Evaluate the Hessian of the objective function at the currently fitted parameter values.
Compatibility with FiniteDiff.jl is experimental. The precise structure, including function names and method definitions, is subject to change without being considered a breaking change. In particular, the exact set of parameters included is subject to change. The θ parameter is always included, but whether σ and/or the fixed effects should be included is currently still being decided.
using FiniteDiff
FiniteDiff.finite_difference_gradient(fm2)
3-element Vector{Float64}:
0.00014822299314916167
-0.0002720018650722045
0.0005648056397592629
FiniteDiff.finite_difference_hessian(fm2)
3×3 LinearAlgebra.Symmetric{Float64, Matrix{Float64}}:
40.8869 31.5168 -14.5287
31.5168 461.419 183.595
-14.5287 183.595 867.555
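Since both extensions differentiate the same objective, their results can be cross-checked against each other. A sketch comparing the two gradient evaluations for fm2:

```julia
g_ad = ForwardDiff.gradient(fm2)
g_fd = FiniteDiff.finite_difference_gradient(fm2)
maximum(abs.(g_ad .- g_fd))   # roughly 1e-6 for the fits shown above
```

The Hessians agree less closely (compare the two matrices above): finite differencing second derivatives amplifies truncation and round-off error, whereas ForwardDiff propagates exact dual-number derivatives.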