API
Types defined in the package
GLM.DensePredChol
— Type.DensePredChol{T}
A LinPred type with a dense Cholesky factorization of X'X
Members
X: model matrix of size n × p with n ≥ p. Should be full column rank.
beta0: base coefficient vector of length p
delbeta: increment to coefficient vector, also of length p
scratchbeta: scratch vector of length p, used in the linpred! method
chol: a Cholesky object created from X'X, possibly using row weights
scratchm1: scratch Matrix{T} of the same size as X
scratchm2: scratch Matrix{T} of the same size as X'X
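As a rough illustration of how a Cholesky factorization of X'X can yield a coefficient increment, the sketch below uses only the LinearAlgebra standard library. The name chol_increment is hypothetical and is not part of GLM; it only mirrors the idea behind the chol and delbeta members, without row weights.

```julia
using LinearAlgebra

# Hypothetical helper (not GLM's implementation): solve the normal
# equations X'X * delta = X'r through a dense Cholesky factorization.
function chol_increment(X::Matrix{Float64}, r::Vector{Float64})
    F = cholesky(Symmetric(X'X))   # requires X to have full column rank
    return F \ (X'r)
end

X = hcat(ones(5), collect(1.0:5.0))
r = [1.0, 3.0, 2.0, 5.0, 4.0]
delta = chol_increment(X, r)       # least-squares intercept and slope
```

For a well-conditioned X this should agree with the least-squares solution X \ r up to rounding.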
GLM.DensePredQR
— Type.DensePredQR
A LinPred type with a dense, unpivoted QR decomposition of X
Members
X: model matrix of size n × p with n ≥ p. Should be full column rank.
beta0: base coefficient vector of length p
delbeta: increment to coefficient vector, also of length p
scratchbeta: scratch vector of length p, used in the linpred! method
qr: a QRCompactWY object created from X, with optional row weights.
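The same increment can be sketched from a QR decomposition of X instead of a Cholesky factorization of X'X. The helper name qr_increment is hypothetical and row weights are ignored; this is an illustration with the LinearAlgebra standard library, not GLM's own code.

```julia
using LinearAlgebra

# Hypothetical helper (not GLM's implementation): least-squares solution
# of X * delta ≈ r from an unpivoted QR decomposition of X.
function qr_increment(X::Matrix{Float64}, r::Vector{Float64})
    F = qr(X)                      # unpivoted QR of the dense matrix
    p = size(X, 2)
    Qtr = (F.Q' * r)[1:p]          # first p entries of Q'r
    return UpperTriangular(F.R) \ Qtr   # back-substitution with R
end

X = hcat(ones(5), collect(1.0:5.0))
r = [1.0, 3.0, 2.0, 5.0, 4.0]
delta_qr = qr_increment(X, r)
```

Working from X directly avoids forming X'X, which is the usual numerical argument for the QR route.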
GLM.GlmResp
— Type.GlmResp
The response vector and various derived vectors in a generalized linear model.
GLM.LinearModel
— Type.
GLM.LmResp
— Type.LmResp
Encapsulates the response for a linear model
Members
mu: current value of the mean response vector or fitted value
offset: optional offset added to the linear predictor to form mu
wts: optional vector of prior weights
y: observed response vector
Either or both of offset and wts may be of length 0
GLM.LinPred
— Type.LinPred
Abstract type representing a linear predictor
GLM.ModResp
— Type.ModResp
Abstract type representing a model response vector
Constructors for models
The most general approach to fitting a model is with the fit function, as in
julia> using Random;
julia> fit(LinearModel, hcat(ones(10), 1:10), randn(MersenneTwister(12321), 10))
LinearModel{LmResp{Array{Float64,1}},DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}}:
Coefficients:
     Estimate  Std.Error   t value  Pr(>|t|)
x1   0.717436   0.775175  0.925515    0.3818
x2  -0.152062   0.124931  -1.21717    0.2582
This model can also be fit as
julia> using Random;
julia> lm(hcat(ones(10), 1:10), randn(MersenneTwister(12321), 10))
LinearModel{LmResp{Array{Float64,1}},DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}}:
Coefficients:
     Estimate  Std.Error   t value  Pr(>|t|)
x1   0.717436   0.775175  0.925515    0.3818
x2  -0.152062   0.124931  -1.21717    0.2582
Model methods
GLM.cancancel
— Function.cancancel(r::GlmResp{V,D,L})
Returns true if dμ/dη for link L is the variance function for distribution D.
When L is the canonical link for D, the derivative of the inverse link is a multiple of the variance function for D. If they are the same, a numerator and denominator term in the expression for the working weights will cancel.
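The cancellation can be checked numerically for the logit link and the Bernoulli variance function. The sketch below is plain Julia with hypothetical names, and it assumes the textbook IRLS form of the working weights, (dμ/dη)^2 / V(μ), which may differ in detail from the expression GLM uses internally.

```julia
logistic(η) = inv(1 + exp(-η))                  # inverse logit link

η = [-2.0, -0.5, 0.0, 1.0, 3.0]
μ = logistic.(η)
dμdη = exp.(-η) ./ abs2.(1 .+ exp.(-η))         # analytic derivative of logistic
varfun = μ .* (1 .- μ)                          # Bernoulli variance function

# Textbook working weights: (dμ/dη)^2 / V(μ)
w_general = abs2.(dμdη) ./ varfun
# When dμ/dη equals V(μ), one factor cancels and the weight is dμ/dη itself.
w_cancelled = dμdη
```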
GLM.delbeta!
— Function.delbeta!(p::LinPred, r::Vector)
Evaluate and return p.delbeta, the increment to the coefficient vector from residual r
StatsBase.deviance
— Function.deviance(obj::LinearModel)
For linear models, the deviance is equal to the residual sum of squares (RSS).
GLM.dispersion
— Function.dispersion(m::AbstractGLM, sqr::Bool=false)
Return the estimated dispersion (or scale) parameter for a model's distribution, generally written σ for linear models and ϕ for generalized linear models. It is, by definition, equal to 1 for the Bernoulli, Binomial, and Poisson families.
If sqr is true, the squared dispersion parameter is returned.
GLM.installbeta!
— Function.installbeta!(p::LinPred, f::Real=1.0)
Install p.beta0 .+= f * p.delbeta and zero out p.delbeta. Return the updated p.beta0.
GLM.issubmodel
— Function.
A helper function to determine if mod1 is nested in mod2
GLM.linpred!
— Function.linpred!(out, p::LinPred, f::Real=1.0)
Overwrite out with the linear predictor from p with factor f.
The effective coefficient vector, p.scratchbeta, is evaluated as p.beta0 .+ f * p.delbeta, and out is updated to p.X * p.scratchbeta.
GLM.linpred
— Function.linpred(p::LinPred, f::Real=1.0)
Return the linear predictor p.X * (p.beta0 .+ f * p.delbeta)
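The interplay of beta0, delbeta and the factor f, as described for linpred and installbeta!, can be sketched with plain arrays. The variables below are hypothetical stand-ins for the fields of a LinPred, and linpred_sketch is not a GLM function.

```julia
# Hypothetical stand-ins for the fields of a LinPred (not GLM's type).
X = hcat(ones(4), collect(1.0:4.0))
beta0 = [1.0, 0.5]
delbeta = [0.2, -0.1]

# Linear predictor for a step of size f along delbeta.
linpred_sketch(X, beta0, delbeta, f=1.0) = X * (beta0 .+ f .* delbeta)

eta_full = linpred_sketch(X, beta0, delbeta)        # full step, f = 1
eta_half = linpred_sketch(X, beta0, delbeta, 0.5)   # half step, f = 0.5

# Installing the half step: fold f * delbeta into beta0, then zero delbeta.
beta0 .+= 0.5 .* delbeta
fill!(delbeta, 0.0)
eta_after = linpred_sketch(X, beta0, delbeta)
```

Keeping the base vector and the increment separate makes it cheap to try several values of f, for example when halving a step, before committing to one.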
GLM.lm
— Function.lm(X, y, allowrankdeficient::Bool=false)
An alias for fit(LinearModel, X, y, allowrankdeficient).
The arguments X and y can be a Matrix and a Vector or a Formula and a DataFrame.
StatsBase.nobs
— Function.nobs(obj::LinearModel)
nobs(obj::GLM)
For linear and generalized linear models, returns the number of rows, or, when prior weights are specified, the sum of weights.
StatsBase.nulldeviance
— Function.nulldeviance(obj::LinearModel)
For linear models, the deviance of the null model is equal to the total sum of squares (TSS).
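For an unweighted linear model with an intercept, both quantities can be computed by hand. The sketch below uses only Base and the Statistics standard library; the data are made up for illustration and no GLM functions are called.

```julia
using Statistics

y = [1.0, 3.0, 2.0, 5.0, 4.0]
X = hcat(ones(5), collect(1.0:5.0))
beta = X \ y                       # ordinary least-squares coefficients
yhat = X * beta                    # fitted values

rss = sum(abs2, y .- yhat)         # deviance of the fitted linear model
tss = sum(abs2, y .- mean(y))      # deviance of the intercept-only model
r2 = 1 - rss / tss                 # coefficient of determination
```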
StatsBase.predict
— Function.predict(mm::LinearModel, newx::AbstractMatrix;
interval::Union{Symbol,Nothing} = nothing, level::Real = 0.95)
If interval is nothing (the default), return a vector with the predicted values for model mm and new data newx. Otherwise, return a 3-column matrix with the prediction and the lower and upper confidence bounds for a given level (0.95 corresponds to alpha = 0.05). Valid values of interval are :confidence, delimiting the uncertainty of the predicted relationship, and :prediction, delimiting estimated bounds for new data points.
predict(mm::AbstractGLM, newX::AbstractMatrix; offset::FPVector=Vector{eltype(newX)}(0))
Form the predicted response of model mm from covariate values newX and, optionally, an offset.
GLM.updateμ!
— Function.updateμ!{T<:FPVector}(r::GlmResp{T}, linPr::T)
Update the mean, working weights and working residuals in r, given a value of the linear predictor, linPr.
GLM.wrkresp
— Function.wrkresp(r::GlmResp)
The working response, r.eta + r.wrkresid - r.offset.
GLM.wrkresp!
— Function.wrkresp!{T<:FPVector}(v::T, r::GlmResp{T})
Overwrite v with the working response of r.
Links and methods applied to them
GLM.Link
— Type.Link
An abstract type whose subtypes determine methods for linkfun, linkinv, mueta, and inverselink.
GLM.Link01
— Type.Link01
An abstract subtype of Link for links defined on (0, 1).
GLM.CauchitLink
— Type.CauchitLink
A Link01 corresponding to the standard Cauchy distribution, Distributions.Cauchy.
GLM.CloglogLink
— Type.CloglogLink
A Link01 corresponding to the extreme value (or log-Weibull) distribution. The link is the complementary log-log transformation, log(-log(1 - μ)).
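A plain-Julia sketch of the complementary log-log transformation, its inverse and the derivative of the inverse may help to check values by hand. The function names are hypothetical; in GLM the same quantities are reached through linkfun, linkinv and mueta with a CloglogLink.

```julia
# Hypothetical stand-alone versions, written with Base functions only.
cloglog(μ) = log(-log1p(-μ))            # η = log(-log(1 - μ))
cloglog_inv(η) = -expm1(-exp(η))        # μ = 1 - exp(-exp(η))
cloglog_mueta(η) = exp(η - exp(η))      # dμ/dη

μ = 0.1:0.2:0.9
η = cloglog.(μ)
μ_back = cloglog_inv.(η)                # round trip back to μ
```

At η = 0 the derivative equals exp(-1), about 0.3679.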
GLM.IdentityLink
— Type.IdentityLink
The canonical Link for the Normal distribution, defined as η = μ.
GLM.InverseLink
— Type.InverseLink
The canonical Link for the Distributions.Gamma distribution, defined as η = inv(μ).
GLM.InverseSquareLink
— Type.InverseSquareLink
The canonical Link for the Distributions.InverseGaussian distribution, defined as η = inv(abs2(μ)).
GLM.LogitLink
— Type.LogitLink
The canonical Link01 for Distributions.Bernoulli and Distributions.Binomial. The inverse link, linkinv, is the c.d.f. of the standard logistic distribution, Distributions.Logistic.
GLM.LogLink
— Type.LogLink
The canonical Link for Distributions.Poisson, defined as η = log(μ).
GLM.NegativeBinomialLink
— Type.NegativeBinomialLink
The canonical Link for the Distributions.NegativeBinomial distribution, defined as η = log(μ/(μ+θ)). The shape parameter θ has to be fixed for the distribution to belong to the exponential family.
GLM.ProbitLink
— Type.ProbitLink
A Link01 whose linkinv is the c.d.f. of the standard normal distribution, Distributions.Normal().
GLM.SqrtLink
— Type.SqrtLink
A Link defined as η = √μ.
GLM.linkfun
— Function.linkfun(L::Link, μ)
Return η, the value of the linear predictor for link L at mean μ.
Examples
julia> μ = inv(10):inv(5):1
0.1:0.2:0.9
julia> show(linkfun.(LogitLink(), μ))
[-2.19722, -0.847298, 0.0, 0.847298, 2.19722]
GLM.linkinv
— Function.linkinv(L::Link, η)
Return μ, the mean value, for link L at linear predictor value η.
Examples
julia> μ = 0.1:0.2:1
0.1:0.2:0.9
julia> η = logit.(μ);
julia> linkinv.(LogitLink(), η) ≈ μ
true
GLM.mueta
— Function.mueta(L::Link, η)
Return the derivative of linkinv, dμ/dη, for link L at linear predictor value η.
Examples
julia> mueta(LogitLink(), 0.0)
0.25
julia> mueta(CloglogLink(), 0.0) ≈ 0.36787944117144233
true
julia> mueta(LogLink(), 2.0) ≈ 7.38905609893065
true
GLM.inverselink
— Function.inverselink(L::Link, η)
Return a 3-tuple of the inverse link, the derivative of the inverse link, and when appropriate, the variance function μ*(1 - μ).
The variance function is returned as NaN unless the range of μ is (0, 1).
Examples
julia> inverselink(LogitLink(), 0.0)
(0.5, 0.25, 0.25)
julia> μ, oneminusμ, variance = inverselink(CloglogLink(), 0.0);
julia> μ + oneminusμ ≈ 1
true
julia> μ*(1 - μ) ≈ variance
true
julia> isnan(last(inverselink(LogLink(), 2.0)))
true
GLM.canonicallink
— Function.canonicallink(D::Distribution)
Return the canonical link for distribution D, which must be in the exponential family.
Examples
julia> canonicallink(Bernoulli())
LogitLink()
GLM.glmvar
— Function.glmvar(D::Distribution, μ)
Return the value of the variance function for D at μ.
The variance of D at μ is the product of the dispersion parameter, ϕ, which does not depend on μ, and the value of glmvar. In other words, glmvar returns the factor of the variance that depends on μ.
Examples
julia> μ = 1/6:1/3:1;
julia> glmvar.(Normal(), μ) # constant for Normal()
3-element Array{Float64,1}:
1.0
1.0
1.0
julia> glmvar.(Bernoulli(), μ) ≈ μ .* (1 .- μ)
true
julia> glmvar.(Poisson(), μ) == μ
true
GLM.mustart
— Function.mustart(D::Distribution, y, wt)
Return a starting value for μ.
For some distributions it is appropriate to set μ = y to initialize the IRLS algorithm, but for others, notably the Bernoulli, the values of y are not allowed as values of μ and must be modified.
Examples
julia> mustart(Bernoulli(), 0.0, 1) ≈ 1/4
true
julia> mustart(Bernoulli(), 1.0, 1) ≈ 3/4
true
julia> mustart(Binomial(), 0.0, 10) ≈ 1/22
true
julia> mustart(Normal(), 0.0, 1) ≈ 0
true
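One way to produce starting values consistent with the examples above is to shrink the observed proportion toward 1/2. The rules below are assumptions chosen to reproduce those example values, with hypothetical names; they are not a transcription of GLM's source.

```julia
# Hypothetical starting-value rules (assumptions, not GLM's code).
# Binomial-type responses: add half a success to wt trials plus one.
mustart_binomial(y, wt) = (wt * y + 0.5) / (wt + 1)
# Normal responses: y itself is an admissible mean.
mustart_normal(y, wt) = y

starts = mustart_binomial.([0.0, 1.0], 1)   # Bernoulli case, wt = 1
```

With wt = 1 the rule maps y = 0 and y = 1 to 1/4 and 3/4, which keeps μ strictly inside (0, 1).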
GLM.devresid
— Function.devresid(D, y, μ)
Return the squared deviance residual of μ from y for distribution D.
The deviance of a GLM can be evaluated as the sum of the squared deviance residuals. This is the principal use for these values. The actual deviance residual, say for plotting, is the signed square root of this value:
sign(y - μ) * sqrt(devresid(D, y, μ))
Examples
julia> devresid(Normal(), 0, 0.25) ≈ abs2(0.25)
true
julia> devresid(Bernoulli(), 1, 0.75) ≈ -2*log(0.75)
true
julia> devresid(Bernoulli(), 0, 0.25) ≈ -2*log1p(-0.25)
true
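The relation between squared deviance residuals, the deviance and the signed residuals can be sketched for a Bernoulli response in plain Julia. The helper devresid_bernoulli is hypothetical and only handles y equal to 0 or 1; the data are made up.

```julia
# Hypothetical helper (not GLM's code): squared deviance residual for a
# Bernoulli response, i.e. minus twice the log-likelihood of y at mean μ.
function devresid_bernoulli(y, μ)
    if y == 1
        return -2 * log(μ)
    elseif y == 0
        return -2 * log1p(-μ)
    else
        throw(ArgumentError("y must be 0 or 1"))
    end
end

y = [1, 0, 1, 1]
μ = [0.75, 0.25, 0.6, 0.9]
sq = devresid_bernoulli.(y, μ)
deviance_total = sum(sq)                    # deviance of the fit
signed = sign.(y .- μ) .* sqrt.(sq)         # deviance residuals, e.g. for plots
```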
GLM.dispersion_parameter
— Function.dispersion_parameter(D) # not exported
Does distribution D have a separate dispersion parameter, ϕ?
Returns false for the Bernoulli, Binomial and Poisson distributions, true otherwise.
Examples
julia> show(GLM.dispersion_parameter(Normal()))
true
julia> show(GLM.dispersion_parameter(Bernoulli()))
false
GLM.loglik_obs
— Function.loglik_obs(D, y, μ, wt, ϕ) # not exported
Returns wt * logpdf(D(μ, ϕ), y), where the parameters of D are derived from μ and ϕ.
The wt argument is a multiplier of the result except in the case of the Binomial, where wt is the number of trials and μ is the proportion of successes.
The loglikelihood of a fitted model is the sum of these values over all the observations.
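As a minimal sketch of that sum, the weighted log-density can be written out for a Normal response. The function loglik_normal is hypothetical, and treating ϕ as the variance of the Normal is an assumption made here for illustration; how GLM maps μ and ϕ to the parameters of each distribution is not shown in this section.

```julia
# Hypothetical sketch: weighted Normal log-density with mean μ and
# variance ϕ (ϕ as the variance is an assumption, not taken from GLM).
loglik_normal(y, μ, wt, ϕ) = wt * (-0.5 * log(2π * ϕ) - abs2(y - μ) / (2ϕ))

y = [1.0, 3.0, 2.0]
μ = [1.5, 2.5, 2.0]
wt = [1.0, 2.0, 1.0]
ϕ = 0.5

# Log-likelihood of the fit: sum of the per-observation contributions.
loglik = sum(loglik_normal.(y, μ, wt, ϕ))
```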