API

The exported symbols from this package define its interface. Some symbols from other packages are re-exported for convenience. Fields of objects with composite types should not be accessed directly; the internals of any given structure may change at any time and this would not be considered a breaking change.

Fitting a model

BetaRegression.BetaRegressionModel — Type

BetaRegressionModel{T,L1,L2,V,M} <: RegressionModel

Type representing a regression model for beta-distributed response values in the open interval (0, 1), as described by Ferrari and Cribari-Neto (2004).

The mean response is linked to the linear predictor by a link function with type L1 <: Link01, i.e. the link must map $(0, 1) \to \mathbb{R}$ and use the GLM package's interface for link functions. While there is no canonical link function for the beta regression model as there is for GLMs, logit is the most common choice.

The precision is transformed by a link function with type L2 <: Link which should map $\mathbb{R} \to \mathbb{R}$ or, ideally, $(0, \infty) \to \mathbb{R}$ because the precision must be positive. The most common choices are the identity, log, and square root links.
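
For orientation, the mean-precision parameterization of Ferrari and Cribari-Neto (2004) can be written as below. The symbols $g_1$ (mean link), $g_2$ (precision link), $\mathbf{x}_i$ (row $i$ of the model matrix), $\phi$ (precision), and $\theta$ (the precision on the link scale) are notation introduced here for illustration; treating $\theta$ as a single scalar is an assumption of this sketch.

$$
y_i \sim \mathrm{Beta}\big(\mu_i \phi,\; (1 - \mu_i)\phi\big), \qquad
g_1(\mu_i) = \mathbf{x}_i^\top \boldsymbol{\beta}, \qquad
g_2(\phi) = \theta
$$

$$
\mathrm{E}[y_i] = \mu_i, \qquad
\mathrm{Var}[y_i] = \frac{\mu_i (1 - \mu_i)}{1 + \phi}
$$

A larger $\phi$ therefore means a smaller variance around the same mean, which is why $\phi$ is called the precision.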

BetaRegression.BetaRegressionModel — Method

BetaRegressionModel(X, y, link=LogitLink(), precisionlink=IdentityLink();
                    weights=nothing, offset=nothing)

Construct a BetaRegressionModel object with the given model matrix X, response y, mean link function link, precision link function precisionlink, and optionally weights and offset. Note that the returned object is not fit until fit! is called on it.

Warning

Support for user-provided weights is currently incomplete; passing a value other than nothing or an empty array for weights will result in an error for now.

StatsAPI.fit — Method

fit(BetaRegressionModel, formula, data, link=LogitLink(), precisionlink=IdentityLink();
    kwargs...)

Fit a BetaRegressionModel to the given table data, which may be any Tables.jl-compatible table (e.g. a DataFrame), using the given formula, which can be constructed using @formula. In this method, the response and model matrix are determined from the formula and table. It is also possible to provide them explicitly.

fit(BetaRegressionModel, X::AbstractMatrix, y::AbstractVector, link=LogitLink(),
    precisionlink=IdentityLink(); kwargs...)

Fit a beta regression model using the provided model matrix X and response vector y. In both of these methods, a link function may be provided, otherwise the default logit link is used. Similarly, a link for the precision may be provided, otherwise the default identity link is used.

Keyword Arguments

weights: A vector of weights or nothing (default). Support is currently incomplete: any value other than nothing or an empty array results in an error.
offset: An offset vector to be added to the linear predictor or nothing (default).
maxiter: Maximum number of Fisher scoring iterations to use when fitting. Default is 100.
atol: Absolute tolerance to use when checking for model convergence. Default is sqrt(eps(T)) where T is the type of the estimates.
rtol: Relative tolerance to use when checking for convergence. Default is the Base default relative tolerance for T.

Tip

If you experience convergence issues, you may consider trying a different link for the precision; LogLink() is a common choice. Increasing the maximum number of iterations may also be beneficial, especially when working with Float32.
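
As a rough illustration of the matrix method and of the tip above, the following sketch fits a model with a log link for the precision and a raised iteration limit. The data are hypothetical and generated deterministically, with a sine term standing in for noise. The sketch assumes the link types are among the symbols re-exported by this package; if they are not, load GLM as well. Whether the fit converges depends on the data, so a ConvergenceException is possible.

```julia
using BetaRegression

# Hypothetical toy data: an intercept column plus one predictor.
n = 50
x = collect(range(-1, 1; length=n))
X = hcat(ones(n), x)

# Responses strictly inside (0, 1); the sine term is a stand-in for noise.
μ = 1 ./ (1 .+ exp.(-(0.3 .+ 0.8 .* x)))
y = μ .+ 0.05 .* sin.(1:n)

# Logit link for the mean, log link for the precision, more iterations than the default.
model = fit(BetaRegressionModel, X, y, LogitLink(), LogLink(); maxiter=200)

# Estimated regression coefficients, one per column of X.
β = coef(model)
```

The log link keeps the precision positive for any value on the link scale, which is one reason it can behave better than the default identity link.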

StatsAPI.fit! — Method

fit!(b::BetaRegressionModel{T}; maxiter=100, atol=sqrt(eps(T)), rtol=Base.rtoldefault(T))

Fit the given BetaRegressionModel, updating its values in-place. If model convergence is achieved, b is returned, otherwise a ConvergenceException is thrown.

Fitting the model consists of computing the maximum likelihood estimates for the coefficients and precision parameter via Fisher scoring with analytic derivatives. The model is considered to have converged when the score vector, i.e. the vector of first partial derivatives of the log likelihood with respect to the parameters, is approximately zero, as checked by isapprox with the specified atol and rtol. maxiter sets the maximum number of Fisher scoring iterations.
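
The stopping rule can be sketched in isolation. The helper score_converged below is hypothetical and only mirrors the description above (comparing the score vector to zero with isapprox); it is not the package's internal code.

```julia
using LinearAlgebra  # provides the array method of isapprox

# Hypothetical helper: is the score vector approximately zero?
# The defaults follow the fit! signature: atol = sqrt(eps(T)), rtol = Base.rtoldefault(T).
function score_converged(score::AbstractVector{T};
                         atol=sqrt(eps(T)),
                         rtol=Base.rtoldefault(T)) where {T<:AbstractFloat}
    return isapprox(score, zero(score); atol=atol, rtol=rtol)
end

score_converged([1e-9, -2e-9])        # true: norm is below sqrt(eps(Float64))
score_converged([1e-5, 0.0])          # false for Float64
score_converged(Float32[1f-5, 0f0])   # true: the Float32 tolerance is much looser
```

Because the comparison is against a zero vector, the relative tolerance contributes little and the absolute tolerance effectively decides convergence.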

Properties of a model

StatsAPI.aic — Function

aic(model::StatisticalModel)

Akaike's Information Criterion, defined as $-2 \log L + 2k$, with $L$ the likelihood of the model, and $k$ its number of consumed degrees of freedom (as returned by dof).

StatsAPI.aicc — Function

aicc(model::StatisticalModel)

Corrected Akaike's Information Criterion for small sample sizes (Hurvich and Tsai 1989), defined as $-2 \log L + 2k + 2k(k+1)/(n-k-1)$, with $L$ the likelihood of the model, $k$ its number of consumed degrees of freedom (as returned by dof), and $n$ the number of observations (as returned by nobs).

StatsAPI.bic — Function

bic(model::StatisticalModel)

Bayesian Information Criterion, defined as $-2 \log L + k \log n$, with $L$ the likelihood of the model, $k$ its number of consumed degrees of freedom (as returned by dof), and $n$ the number of observations (as returned by nobs).
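
As a worked example of the three criteria, the following sketch evaluates the formulas by hand. The values of loglik, k, and n are hypothetical; for a fitted model they would come from loglikelihood, dof, and nobs. The small-sample term used for AICc is the standard $2k(k+1)/(n-k-1)$.

```julia
# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
loglik = -12.5   # log likelihood, log L
k = 3            # consumed degrees of freedom
n = 50           # number of observations

aic_value  = -2 * loglik + 2 * k                          # 31.0
aicc_value = aic_value + 2 * k * (k + 1) / (n - k - 1)    # AIC plus the small-sample correction
bic_value  = -2 * loglik + k * log(n)
```

The AICc correction shrinks toward zero as n grows relative to k, so AIC and AICc agree for large samples.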

StatsAPI.coef — Method

coef(model::BetaRegressionModel)

Return a copy of the vector of regression coefficients $\boldsymbol{\beta}$.

Link functions

This package employs the system for link functions defined by the GLM.jl package. In short, each link function has its own concrete type which subtypes Link. Some may actually subtype Link01, which is itself a subtype of Link; this denotes that the function's domain is the open unit interval, $(0, 1)$. Link functions are applied with linkfun and their inverses are applied with linkinv. Relevant docstrings from GLM.jl are reproduced below.

Any mention of "the" link function for a BetaRegressionModel refers to that applied to the mean (at least in this document). However, despite only having one linear predictor, BetaRegressionModels actually have two link functions: one for the mean and one for the precision.
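
A short sketch of this interface, using GLM.jl directly: a link for the mean and a link for the precision are each applied with linkfun and inverted with linkinv. The function names are qualified with GLM. in case they are not exported, and the input values are arbitrary.

```julia
using GLM

# Link for the mean: defined on (0, 1), so it subtypes Link01.
meanlink = LogitLink()
LogitLink <: GLM.Link01            # true
η = GLM.linkfun(meanlink, 0.25)    # log(0.25 / 0.75)
μ = GLM.linkinv(meanlink, η)       # recovers 0.25 up to rounding

# Link for the precision: a Link, but not a Link01.
preclink = LogLink()
LogLink <: GLM.Link01              # false
θ = GLM.linkfun(preclink, 10.0)    # log(10)
ϕ = GLM.linkinv(preclink, θ)       # recovers 10 up to rounding
```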

Mean

GLM.Link01 — Type

Link01

An abstract subtype of Link for links defined on (0, 1).

GLM.LogitLink — Type

LogitLink

The canonical Link01 for Distributions.Bernoulli and Distributions.Binomial. The inverse link, linkinv, is the c.d.f. of the standard logistic distribution, Distributions.Logistic.

GLM.CauchitLink — Type

CauchitLink

A Link01 corresponding to the standard Cauchy distribution, Distributions.Cauchy.

GLM.CloglogLink — Type

CloglogLink

A Link01 corresponding to the extreme value (or log-Weibull) distribution. The link is the complementary log-log transformation, log(-log(1 - μ)).
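
The complementary log-log transformation, log(-log(1 - μ)), and its inverse, 1 - exp(-exp(η)), can be checked numerically with a small stand-alone sketch. The function names cloglog and cloglog_inv are hypothetical and independent of GLM.jl's own implementation.

```julia
# Hypothetical stand-alone versions of the transformation and its inverse.
cloglog(μ) = log(-log1p(-μ))        # log(-log(1 - μ)), defined for μ in (0, 1)
cloglog_inv(η) = -expm1(-exp(η))    # 1 - exp(-exp(η)), always in (0, 1)

η = cloglog(0.3)
cloglog_inv(η)            # recovers 0.3 up to rounding
cloglog(1 - exp(-1))      # approximately 0, since -log(1 - μ) = 1 at this point
```

Unlike the logit, this link is asymmetric around μ = 0.5.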

GLM.ProbitLink — Type

ProbitLink

A Link01 whose linkinv is the c.d.f. of the standard normal distribution, Distributions.Normal().

Precision

GLM.IdentityLink — Type

IdentityLink

The canonical Link for the Normal distribution, defined as η = μ.

GLM.InverseLink — Type

InverseLink

The canonical Link for the Distributions.Gamma distribution, defined as η = inv(μ).

GLM.InverseSquareLink — Type

InverseSquareLink

The canonical Link for the Distributions.InverseGaussian distribution, defined as η = inv(abs2(μ)).

GLM.LogLink — Type

LogLink

The canonical Link for Distributions.Poisson, defined as η = log(μ).

GLM.PowerLink — Type

PowerLink

A Link defined as η = μ^λ when λ ≠ 0, and as η = log(μ) when λ = 0, i.e. the class of transforms that use a power function or logarithmic function.

Many other links are special cases of PowerLink:

IdentityLink when λ = 1.
SqrtLink when λ = 0.5.
LogLink when λ = 0.
InverseLink when λ = -1.
InverseSquareLink when λ = -2.
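
The special cases listed above can be reproduced with a stand-alone sketch of the transformation. The helper power_transform is hypothetical and is not GLM.jl's implementation of PowerLink.

```julia
# Hypothetical helper: η = μ^λ for λ ≠ 0 and η = log(μ) for λ = 0.
power_transform(μ, λ) = iszero(λ) ? log(μ) : μ^λ

μ = 4.0
power_transform(μ, 1)      # identity: 4.0
power_transform(μ, 0.5)    # square root: 2.0
power_transform(μ, 0)      # log(4.0)
power_transform(μ, -1)     # inverse: 0.25
power_transform(μ, -2)     # inverse square: 0.0625
```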

GLM.SqrtLink — Type

SqrtLink

A Link defined as η = √μ.

Developer documentation

This section documents some functions that are not user facing (and are thus not exported) and may be removed at any time. They're included here for the benefit of anyone looking to contribute to the package and wondering how certain internals work. Other internal functions may be documented with comments in the source code rather than with docstrings; read the source directly for more information on those.

BetaRegression.dmueta — Function

dmueta(link::Link, η)

Return the second derivative of linkinv, $\frac{\partial^2 \mu}{\partial \eta^2}$, of the link function link evaluated at the linear predictor value η. A method of this function must be defined for a particular link function in order to compute the observed information matrix.
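
To make the quantity concrete, here is a stand-alone sketch for the logit link, where $\mu = 1 / (1 + e^{-\eta})$, the first derivative is $\mu(1 - \mu)$, and the second derivative is $\mu(1 - \mu)(1 - 2\mu)$. The function names are hypothetical and the code is independent of the package's own methods; the second derivative is checked against a central finite difference of the first.

```julia
# Hypothetical stand-alone functions for the logit link.
logistic(η) = 1 / (1 + exp(-η))          # inverse link, μ as a function of η

function logit_mueta(η)                  # first derivative, dμ/dη
    μ = logistic(η)
    return μ * (1 - μ)
end

function logit_dmueta(η)                 # second derivative, d²μ/dη²
    μ = logistic(η)
    return μ * (1 - μ) * (1 - 2μ)
end

# Numerical check: central finite difference of the first derivative.
η, h = 0.7, 1e-5
fd = (logit_mueta(η + h) - logit_mueta(η - h)) / (2h)
isapprox(logit_dmueta(η), fd; atol=1e-8)   # true
```

At η = 0 the second derivative is zero, because μ = 0.5 is the inflection point of the logistic curve.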

BetaRegression.initialize! — Function

initialize!(b::BetaRegressionModel)

Initialize the given BetaRegressionModel by computing starting points for the parameter estimates and return the updated model object. The initial estimates are based on those from a linear regression model with the same model matrix as b but with linkfun.(Link(b), response(b)) as the response.

If the initial estimate of the precision is invalid (not strictly positive) then it is taken instead to be 1 prior to applying the precision link function.
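
The idea behind the starting values can be sketched without the package: regress the link-transformed response on the model matrix by least squares, and fall back to 1 when a candidate precision is not strictly positive. The data, the logit helper, and valid_precision are hypothetical; how the package computes its candidate precision is not covered by this sketch.

```julia
using LinearAlgebra

# Hypothetical stand-alone logit, playing the role of linkfun for the mean link.
logit(μ) = log(μ / (1 - μ))

# Hypothetical toy data: intercept plus one predictor, responses in (0, 1).
X = hcat(ones(5), [-1.0, -0.5, 0.0, 0.5, 1.0])
y = [0.2, 0.35, 0.5, 0.6, 0.8]

# Starting coefficients: least squares fit of the transformed response on X.
β0 = X \ logit.(y)

# Fallback described above: an invalid precision estimate is replaced by 1.
valid_precision(ϕ) = ϕ > 0 ? ϕ : one(ϕ)

valid_precision(2.5)     # 2.5, already valid
valid_precision(-3.0)    # 1.0, replaced
```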