API

Types defined in the package

DensePredChol{T}

A LinPred type with a dense Cholesky factorization of X'X

Members

  • X: model matrix of size n × p with n ≥ p. Should be full column rank.
  • beta0: base coefficient vector of length p
  • delbeta: increment to coefficient vector, also of length p
  • scratchbeta: scratch vector of length p, used in linpred! method
  • chol: a Cholesky object created from X'X, possibly using row weights.
  • scratchm1: scratch Matrix{T} of the same size as X
  • scratchm2: scratch Matrix{T} of the same size as X'X
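
As a rough illustration of what such a predictor stores, the sketch below forms the Cholesky factor of X'X with the LinearAlgebra standard library and uses it to solve the normal equations. It does not touch GLM's internals; the data and the variable names chol and beta are made up for the example.

```julia
using LinearAlgebra

# Small full-column-rank model matrix (n = 5, p = 2) and a made-up response.
X = hcat(ones(5), 1:5)
y = [1.1, 1.9, 3.2, 3.9, 5.1]

# Cholesky factorization of the p × p cross-product matrix X'X.
chol = cholesky(Symmetric(X' * X))

# Solve the normal equations X'X * beta = X'y using the factorization.
beta = chol \ (X' * y)
```

The result should agree with the least-squares solution X \ y up to rounding.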
source
GLM.DensePredQRType.
DensePredQR

A LinPred type with a dense, unpivoted QR decomposition of X

Members

  • X: Model matrix of size n × p with n ≥ p. Should be full column rank.
  • beta0: base coefficient vector of length p
  • delbeta: increment to coefficient vector, also of length p
  • scratchbeta: scratch vector of length p, used in linpred! method
  • qr: a QRCompactWY object created from X, with optional row weights.
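
For comparison, a minimal sketch of the QR route using only the LinearAlgebra standard library. The data are invented and the snippet is independent of GLM; it only shows that an unpivoted QR of a dense Float64 matrix yields a QRCompactWY object that can be used to obtain least-squares coefficients.

```julia
using LinearAlgebra

# Small full-column-rank model matrix (n = 5, p = 2) and a made-up response.
X = hcat(ones(5), 1:5)
y = [1.1, 1.9, 3.2, 3.9, 5.1]

# Unpivoted QR of a dense Float64 matrix; the result is a QRCompactWY object.
F = qr(X)

# Least-squares coefficients obtained from the factorization.
beta = F \ y
```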
source
GLM.GlmRespType.
GlmResp

The response vector and various derived vectors in a generalized linear model.

source
GLM.LinearModelType.
LinearModel

A combination of a LmResp and a LinPred

Members

  • rr: a LmResp object
  • pp: a LinPred object
source
GLM.LmRespType.
LmResp

Encapsulates the response for a linear model

Members

  • mu: current value of the mean response vector or fitted value
  • offset: optional offset added to the linear predictor to form mu
  • wts: optional vector of prior weights
  • y: observed response vector

Either or both offset and wts may be of length 0

source
GLM.LinPredType.
LinPred

Abstract type representing a linear predictor

source
GLM.ModRespType.
ModResp

Abstract type representing a model response vector

source

Constructors for models

The most general approach to fitting a model is with the fit function, as in

julia> using Random;

julia> fit(LinearModel, hcat(ones(10), 1:10), randn(MersenneTwister(12321), 10))
LinearModel{LmResp{Array{Float64,1}},DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}}:

Coefficients:
      Estimate Std.Error  t value Pr(>|t|)
x1    0.717436  0.775175 0.925515   0.3818
x2   -0.152062  0.124931 -1.21717   0.2582

This model can also be fit as

julia> using Random;

julia> lm(hcat(ones(10), 1:10), randn(MersenneTwister(12321), 10))
LinearModel{LmResp{Array{Float64,1}},DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}}:

Coefficients:
      Estimate Std.Error  t value Pr(>|t|)
x1    0.717436  0.775175 0.925515   0.3818
x2   -0.152062  0.124931 -1.21717   0.2582

Model methods

GLM.cancancelFunction.
cancancel(r::GlmResp{V,D,L})

Returns true if dμ/dη for link L is the variance function for distribution D

When L is the canonical link for D, the derivative of the inverse link is a multiple of the variance function for D. If the two are the same, a numerator term and a denominator term in the expression for the working weights cancel.
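
A numerical sketch of this cancellation for the logit link with a Bernoulli response, written without GLM. It assumes the usual IRLS working weight (dμ/dη)² / V(μ); the helper names logistic, dmu_deta and bernoulli_var are hypothetical.

```julia
# Inverse logit link, its derivative, and the Bernoulli variance function.
logistic(η) = 1 / (1 + exp(-η))
dmu_deta(η) = logistic(η) * (1 - logistic(η))
bernoulli_var(μ) = μ * (1 - μ)

η = -2.0:0.5:2.0
μ = logistic.(η)

# General form of the working weight: (dμ/dη)^2 / V(μ).
w_general = dmu_deta.(η) .^ 2 ./ bernoulli_var.(μ)

# With the canonical link, dμ/dη equals V(μ), so one factor cancels.
w_cancelled = dmu_deta.(η)
```

Both vectors hold the same values, which is why the division can be skipped in this case.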

source
GLM.delbeta!Function.
delbeta!(p::LinPred, r::Vector)

Evaluate and return p.delbeta, the increment to the coefficient vector, from the residual r

source
StatsBase.devianceFunction.
deviance(obj::LinearModel)

For linear models, the deviance is equal to the residual sum of squares (RSS).

source
GLM.dispersionFunction.
dispersion(m::AbstractGLM, sqr::Bool=false)

Return the estimated dispersion (or scale) parameter for a model's distribution, generally written σ for linear models and ϕ for generalized linear models. It is, by definition, equal to 1 for the Bernoulli, Binomial, and Poisson families.

If sqr is true, the squared dispersion parameter is returned.

source
GLM.installbeta!Function.
installbeta!(p::LinPred, f::Real=1.0)

Install p.beta0 .+= f * p.delbeta and zero out p.delbeta. Return the updated p.beta0.

source
GLM.issubmodelFunction.

A helper function to determine if mod1 is nested in mod2

source
GLM.linpred!Function.
linpred!(out, p::LinPred, f::Real=1.0)

Overwrite out with the linear predictor from p with factor f

The effective coefficient vector, p.scratchbeta, is evaluated as p.beta0 .+ f * p.delbeta, and out is updated to p.X * p.scratchbeta
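
The update can be sketched with plain arrays standing in for the fields of p. This is only an illustration under that assumption, not the package's method; the variable names mirror the field names above and the numbers are invented.

```julia
using LinearAlgebra

# Plain arrays standing in for p.X, p.beta0, p.delbeta and p.scratchbeta.
X = hcat(ones(4), 1:4)
beta0 = [1.0, 2.0]
delbeta = [0.5, -1.0]
scratchbeta = similar(beta0)
out = zeros(4)
f = 0.5

# Effective coefficients: beta0 plus a fraction f of the increment.
scratchbeta .= beta0 .+ f .* delbeta

# Overwrite out in place with X * scratchbeta, without allocating a new vector.
mul!(out, X, scratchbeta)
```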

source
GLM.linpredFunction.
linpred(p::LinPred, f::Real=1.0)

Return the linear predictor p.X * (p.beta0 .+ f * p.delbeta)

source
GLM.lmFunction.
lm(X, y, allowrankdeficient::Bool=false)

An alias for fit(LinearModel, X, y, allowrankdeficient)

The arguments X and y can be a Matrix and a Vector or a Formula and a DataFrame.

source
StatsBase.nobsFunction.
nobs(obj::LinearModel)
nobs(obj::GLM)

For linear and generalized linear models, returns the number of rows, or, when prior weights are specified, the sum of weights.

source
nulldeviance(obj::LinearModel)

For linear models, the deviance of the null model is equal to the total sum of squares (TSS).
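
The two sums of squares can be computed by hand, as in the sketch below. It uses made-up data, fits by ordinary least squares with the backslash operator rather than with GLM, and assumes an unweighted model that includes an intercept, so the null model is the mean of y.

```julia
using LinearAlgebra
using Statistics

# Model matrix with an intercept column and a made-up response.
X = hcat(ones(5), 1:5)
y = [1.1, 1.9, 3.2, 3.9, 5.1]

# Ordinary least-squares fit and fitted values.
beta = X \ y
fitted = X * beta

rss = sum(abs2, y .- fitted)     # residual sum of squares (the deviance)
tss = sum(abs2, y .- mean(y))    # total sum of squares (the null deviance)
r2 = 1 - rss / tss               # coefficient of determination
```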

source
StatsBase.predictFunction.
predict(mm::LinearModel, newx::AbstractMatrix;
        interval::Union{Symbol,Nothing} = nothing, level::Real = 0.95)

If interval is nothing (the default), return a vector with the predicted values for model mm and new data newx. Otherwise, return a 3-column matrix with the prediction and the lower and upper bounds of the interval at the given level (level = 0.95 corresponds to alpha = 0.05). Valid values of interval are :confidence, which delimits the uncertainty of the predicted relationship, and :prediction, which delimits the estimated bounds for new data points.

source
predict(mm::AbstractGLM, newX::AbstractMatrix; offset::FPVector=eltype(newX)[])

Form the predicted response of model mm from covariate values newX and, optionally, an offset.

source
GLM.updateμ!Function.
updateμ!(r::GlmResp{T}, linPr::T) where {T<:FPVector}

Update the mean, working weights, and working residuals in r, given a value of the linear predictor, linPr.

source
GLM.wrkrespFunction.
wrkresp(r::GlmResp)

The working response, r.eta + r.wrkresid - r.offset.

source
GLM.wrkresp!Function.
wrkresp!(v::T, r::GlmResp{T}) where {T<:FPVector}

Overwrite v with the working response of r

source

Links and methods applied to them

GLM.LinkType.
Link

An abstract type whose subtypes determine methods for linkfun, linkinv, mueta, and inverselink.

source
GLM.Link01Type.
Link01

An abstract subtype of Link for links defined on (0, 1)

source
GLM.CauchitLinkType.
CauchitLink

A Link01 corresponding to the standard Cauchy distribution, Distributions.Cauchy.

source
GLM.CloglogLinkType.
CloglogLink

A Link01 corresponding to the extreme value (or log-Weibull) distribution. The link is the complementary log-log transformation, log(-log(1 - μ)).
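
A standalone sketch of the transformation and its inverse, taking the complementary log-log link to be η = log(-log(1 - μ)), the usual definition. The helper names cloglog and cloglog_inv are hypothetical and are not GLM functions.

```julia
# Complementary log-log link and its inverse (hypothetical helper names).
cloglog(μ) = log(-log1p(-μ))        # log(-log(1 - μ))
cloglog_inv(η) = -expm1(-exp(η))    # 1 - exp(-exp(η))

# Map a grid of means to the linear-predictor scale.
μ = 0.1:0.2:0.9
η = cloglog.(μ)
```

Applying cloglog_inv to η recovers μ, and cloglog_inv(0.0) equals 1 - exp(-1).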

source
IdentityLink

The canonical Link for the Normal distribution, defined as η = μ.

source
GLM.InverseLinkType.
InverseLink

The canonical Link for the Distributions.Gamma distribution, defined as η = inv(μ).

source
InverseSquareLink

The canonical Link for the Distributions.InverseGaussian distribution, defined as η = inv(abs2(μ)).

source
GLM.LogitLinkType.
LogitLink

The canonical Link01 for Distributions.Bernoulli and Distributions.Binomial. The inverse link, linkinv, is the c.d.f. of the standard logistic distribution, Distributions.Logistic.

source
GLM.LogLinkType.
LogLink

The canonical Link for Distributions.Poisson, defined as η = log(μ).

source
NegativeBinomialLink

The canonical Link for the Distributions.NegativeBinomial distribution, defined as η = log(μ/(μ+θ)). The shape parameter θ has to be fixed for the distribution to belong to the exponential family.

source
GLM.ProbitLinkType.
ProbitLink

A Link01 whose linkinv is the c.d.f. of the standard normal distribution, Distributions.Normal().

source
GLM.SqrtLinkType.
SqrtLink

A Link defined as η = √μ

source
GLM.linkfunFunction.
linkfun(L::Link, μ)

Return η, the value of the linear predictor for link L at mean μ.

Examples

julia> μ = inv(10):inv(5):1
0.1:0.2:0.9

julia> show(linkfun.(LogitLink(), μ))
[-2.19722, -0.847298, 0.0, 0.847298, 2.19722]
source
GLM.linkinvFunction.
linkinv(L::Link, η)

Return μ, the mean value, for link L at linear predictor value η.

Examples

julia> μ = 0.1:0.2:1
0.1:0.2:0.9

julia> η = logit.(μ);

julia> linkinv.(LogitLink(), η) ≈ μ
true
source
GLM.muetaFunction.
mueta(L::Link, η)

Return the derivative of linkinv, dμ/dη, for link L at linear predictor value η.

Examples

julia> mueta(LogitLink(), 0.0)
0.25

julia> mueta(CloglogLink(), 0.0) ≈ 0.36787944117144233
true

julia> mueta(LogLink(), 2.0) ≈ 7.38905609893065
true
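
The values above can be cross-checked without GLM by differentiating the inverse links numerically. The sketch below uses a central finite difference; fd_derivative and the three inverse-link helpers are hypothetical names introduced only for this check.

```julia
# Central finite-difference approximation to the derivative of g at η.
fd_derivative(g, η; h=1e-6) = (g(η + h) - g(η - h)) / (2h)

# Inverse links written out by hand.
logit_inv(η) = 1 / (1 + exp(-η))
cloglog_inv(η) = 1 - exp(-exp(η))
log_inv(η) = exp(η)

d_logit = fd_derivative(logit_inv, 0.0)        # close to 0.25
d_cloglog = fd_derivative(cloglog_inv, 0.0)    # close to exp(-1)
d_log = fd_derivative(log_inv, 2.0)            # close to exp(2)
```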
source
GLM.inverselinkFunction.
inverselink(L::Link, η)

Return a 3-tuple of the inverse link, the derivative of the inverse link, and when appropriate, the variance function μ*(1 - μ).

The variance function is returned as NaN unless the range of μ is (0, 1)

Examples

julia> inverselink(LogitLink(), 0.0)
(0.5, 0.25, 0.25)

julia> μ, oneminusμ, variance = inverselink(CloglogLink(), 0.0);

julia> μ + oneminusμ ≈ 1
true

julia> μ*(1 - μ) ≈ variance
true

julia> isnan(last(inverselink(LogLink(), 2.0)))
true
source
GLM.canonicallinkFunction.
canonicallink(D::Distribution)

Return the canonical link for distribution D, which must be in the exponential family.

Examples

julia> canonicallink(Bernoulli())
LogitLink()
source
GLM.glmvarFunction.
glmvar(D::Distribution, μ)

Return the value of the variance function for D at μ

The variance of D at μ is the product of the dispersion parameter, ϕ, which does not depend on μ, and the value of glmvar. In other words, glmvar returns the factor of the variance that depends on μ.

Examples

julia> μ = 1/6:1/3:1;

julia> glmvar.(Normal(), μ)    # constant for Normal()
3-element Array{Float64,1}:
 1.0
 1.0
 1.0

julia> glmvar.(Bernoulli(), μ) ≈ μ .* (1 .- μ)
true

julia> glmvar.(Poisson(), μ) == μ
true
source
GLM.mustartFunction.
mustart(D::Distribution, y, wt)

Return a starting value for μ.

For some distributions it is appropriate to set μ = y to initialize the IRLS algorithm but for others, notably the Bernoulli, the values of y are not allowed as values of μ and must be modified.

Examples

julia> mustart(Bernoulli(), 0.0, 1) ≈ 1/4
true

julia> mustart(Bernoulli(), 1.0, 1) ≈ 3/4
true

julia> mustart(Binomial(), 0.0, 10) ≈ 1/22
true

julia> mustart(Normal(), 0.0, 1) ≈ 0
true
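
The example values are consistent with a simple shrinkage of y away from the boundary. The sketch below reproduces them with hand-written formulas; that these match the package's implementation is an assumption, and the helper names are hypothetical.

```julia
# Starting values that pull y away from the boundary of (0, 1).
# For the Binomial, y is the observed proportion and wt the number of trials.
mustart_binomial(y, wt) = (wt * y + 0.5) / (wt + 1)

# The Bernoulli case is a Binomial with a single trial: (y + 0.5) / 2.
mustart_bernoulli(y) = mustart_binomial(y, 1)

# For the Normal, y itself is an admissible value of μ.
mustart_normal(y) = y
```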
source
GLM.devresidFunction.
devresid(D, y, μ)

Return the squared deviance residual of μ from y for distribution D

The deviance of a GLM can be evaluated as the sum of the squared deviance residuals. This is the principal use for these values. The actual deviance residual, say for plotting, is the signed square root of this value:

sign(y - μ) * sqrt(devresid(D, y, μ))

Examples

julia> devresid(Normal(), 0, 0.25) ≈ abs2(0.25)
true

julia> devresid(Bernoulli(), 1, 0.75) ≈ -2*log(0.75)
true

julia> devresid(Bernoulli(), 0, 0.25) ≈ -2*log1p(-0.25)
true
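
A standalone sketch of these quantities for the Normal and Bernoulli cases, using the closed forms that appear in the examples above. The helper functions are hypothetical stand-ins, not GLM's methods, and the data are invented.

```julia
# Squared deviance residuals (hypothetical helpers, not GLM's methods).
devresid_normal(y, μ) = abs2(y - μ)

function devresid_bernoulli(y, μ)
    # y is 0 or 1; log1p keeps accuracy when μ is small.
    y == 1 ? -2 * log(μ) : -2 * log1p(-μ)
end

# Signed deviance residual, e.g. for plotting.
signed_devresid(d2, y, μ) = sign(y - μ) * sqrt(d2)

y = [0, 1, 1, 0]
μ = [0.25, 0.75, 0.6, 0.1]

d2 = devresid_bernoulli.(y, μ)           # squared deviance residuals
deviance_total = sum(d2)                 # deviance as their sum
signed_res = signed_devresid.(d2, y, μ)  # negative where y < μ
```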
source
dispersion_parameter(D)  # not exported

Does distribution D have a separate dispersion parameter, ϕ?

Returns false for the Bernoulli, Binomial and Poisson distributions, true otherwise.

Examples

julia> show(GLM.dispersion_parameter(Normal()))
true
julia> show(GLM.dispersion_parameter(Bernoulli()))
false
source
GLM.loglik_obsFunction.
loglik_obs(D, y, μ, wt, ϕ)  # not exported

Returns wt * logpdf(D(μ, ϕ), y) where the parameters of D are derived from μ and ϕ.

The wt argument is a multiplier of the result except in the case of the Binomial where wt is the number of trials and μ is the proportion of successes.

The loglikelihood of a fitted model is the sum of these values over all the observations.
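
As an illustration of the summation, the sketch below writes out the Poisson case by hand. The Poisson has no separate dispersion parameter, so ϕ is omitted here. The helper names and the data are hypothetical, and the snippet does not call GLM or Distributions.

```julia
# Log-density of a Poisson with mean μ at an integer count y.
poisson_logpdf(y::Integer, μ) = y * log(μ) - μ - log(factorial(y))

# Per-observation contribution: the weight multiplies the log-density.
loglik_obs_poisson(y, μ, wt) = wt * poisson_logpdf(y, μ)

y = [0, 2, 1, 3]               # observed counts
μ = [0.5, 1.5, 1.0, 2.5]       # fitted means
wt = [1.0, 1.0, 2.0, 1.0]      # prior weights

# Log-likelihood as the sum over observations.
loglik = sum(loglik_obs_poisson.(y, μ, wt))
```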

source