# Parametric tests

## Power divergence test

`HypothesisTests.PowerDivergenceTest`

— Type `PowerDivergenceTest(x[, y]; lambda = 1.0, theta0 = ones(length(x))/length(x))`

Perform a power divergence test.

If `y` is not given and `x` is a matrix with one row or column, or `x` is a vector, then a goodness-of-fit test is performed (`x` is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in `theta0`, or are all equal if `theta0` is not given.

If `x` is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, `x` and `y` must be vectors of the same length. The contingency table is calculated using the `counts` function from the `StatsBase` package. The power divergence test is then conducted under the null hypothesis that the joint distribution of the cell counts in a two-dimensional contingency table is the product of the row and column marginals.

Note that the entries of `x` (and `y`, if provided) must be non-negative integers.

By default, computed confidence intervals are Quesenberry-Hurst intervals if the minimum of the expected cell counts exceeds 100, and Sison-Glaz intervals otherwise. See the `confint(::PowerDivergenceTest)` documentation for a list of supported methods to compute confidence intervals.

The power divergence test is given by

\[ \dfrac{2}{λ(λ+1)}\sum_{i=1}^I \sum_{j=1}^J n_{ij} \left[(n_{ij} /\hat{n}_{ij})^λ -1\right]\]

where $n_{ij}$ is the cell count in the $i$ th row and $j$ th column and $λ$ is a real number determining the nature of the test to be performed:

- $λ = 1$: equal to Pearson's chi-squared statistic
- $λ \to 0$: converges to the likelihood ratio test statistic
- $λ \to -1$: converges to the minimum discrimination information statistic (Gokhale and Kullback, 1978)
- $λ = -2$: equals Neyman modified chi-squared (Neyman, 1949)
- $λ = -1/2$: equals the Freeman-Tukey statistic (Freeman and Tukey, 1950).

Under regularity conditions, the asymptotic distributions are identical (see Drost et al. 1989). The $χ^2$ null approximation works best for $λ$ near $2/3$.

Implements: `pvalue`, `confint(::PowerDivergenceTest)`

**References**

- Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.
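As a quick sketch, the statistic family above can be explored by varying `lambda`; the counts here are hypothetical, chosen only for illustration:

```julia
using HypothesisTests

# Hypothetical one-dimensional contingency table of observed counts.
x = [28, 35, 41, 16]

# λ = 1 gives Pearson's chi-squared statistic, λ = 0 the likelihood ratio
# statistic, and λ = 2/3 the value for which the χ² approximation works best.
t_pearson = PowerDivergenceTest(x; lambda = 1.0)
t_lr      = PowerDivergenceTest(x; lambda = 0.0)
t_cr      = PowerDivergenceTest(x; lambda = 2/3)

p = pvalue(t_pearson)
```

With `lambda = 1.0` the p-value coincides with that of `ChisqTest` on the same counts.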

`StatsAPI.confint`

— Method `confint(test::PowerDivergenceTest; level = 0.95, tail = :both, method = :auto)`

Compute a confidence interval with coverage `level` for multinomial proportions using one of the following methods. Possible values for `method` are:

- `:auto` (default): If the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used; otherwise Sison-Glaz.
- `:sison_glaz`: Sison-Glaz intervals
- `:bootstrap`: Bootstrap intervals
- `:quesenberry_hurst`: Quesenberry-Hurst intervals
- `:gold`: Gold intervals (asymptotic simultaneous intervals)

**References**

- Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.
- Sison, C.P. and Glaz, J. Simultaneous confidence intervals and sample size determination for multinomial proportions. Journal of the American Statistical Association, 90:366-369, 1995.
- Quesenberry, C.P. and Hurst, D.C. Large Sample Simultaneous Confidence Intervals for Multinomial Proportions. Technometrics, 6:191-195, 1964.
- Gold, R. Z. Tests Auxiliary to $χ^2$ Tests in a Markov Chain. Annals of Mathematical Statistics, 30:56-74, 1963.
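A minimal sketch of requesting intervals with different methods, on hypothetical cell counts:

```julia
using HypothesisTests

x = [50, 60, 55, 35]                 # hypothetical cell counts
t = PowerDivergenceTest(x)

ci_default = confint(t)                              # method = :auto
ci_sg      = confint(t; method = :sison_glaz)
ci_qh      = confint(t; method = :quesenberry_hurst)
```

Each call returns one `(lower, upper)` interval per cell of the table.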

## Pearson chi-squared test

`HypothesisTests.ChisqTest`

— Function `ChisqTest(x[, y][, theta0 = ones(length(x))/length(x)])`

Perform a Pearson chi-squared test (equivalent to a `PowerDivergenceTest` with $λ = 1$).

If `y` is not given and `x` is a matrix with one row or column, or `x` is a vector, then a goodness-of-fit test is performed (`x` is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in `theta0`, or are all equal if `theta0` is not given.

If both `x` and `y` are given and both are vectors of integer type, then once again a goodness-of-fit test is performed. In this case, `theta0` is estimated from the proportions of the individual values in `y`. Here, the hypothesis tested is whether the two samples `x` and `y` come from the same population.

If `x` is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, `x` and `y` must be vectors of the same length. The contingency table is calculated using the `counts` function from the `StatsBase` package. The power divergence test is then conducted under the null hypothesis that the joint distribution of the cell counts in a two-dimensional contingency table is the product of the row and column marginals.

Note that the entries of `x` (and `y`, if provided) must be non-negative integers.
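A short sketch of both usage modes, with hypothetical data:

```julia
using HypothesisTests

# Goodness of fit of observed counts against specified probabilities.
obs    = [89, 37, 30, 28, 16]
theta0 = [0.45, 0.20, 0.15, 0.12, 0.08]   # must sum to 1
t1 = ChisqTest(obs, theta0)

# Test of independence on a two-dimensional contingency table.
tbl = [20 30;
       30 20]
t2 = ChisqTest(tbl)

p1, p2 = pvalue(t1), pvalue(t2)
```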

## Multinomial likelihood ratio test

`HypothesisTests.MultinomialLRTest`

— Function `MultinomialLRTest(x[, y][, theta0 = ones(length(x))/length(x)])`

Perform a multinomial likelihood ratio test (equivalent to a `PowerDivergenceTest` with $λ = 0$).

If `y` is not given and `x` is a matrix with one row or column, or `x` is a vector, then a goodness-of-fit test is performed (`x` is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in `theta0`, or are all equal if `theta0` is not given.

If `x` is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, `x` and `y` must be vectors of the same length. The contingency table is calculated using the `counts` function from the `StatsBase` package. The power divergence test is then conducted under the null hypothesis that the joint distribution of the cell counts in a two-dimensional contingency table is the product of the row and column marginals.

Note that the entries of `x` (and `y`, if provided) must be non-negative integers.
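A minimal sketch on hypothetical counts; the result matches a `PowerDivergenceTest` with `lambda = 0`:

```julia
using HypothesisTests

x = [15, 25, 20, 40]              # hypothetical counts
t = MultinomialLRTest(x)          # H0: all cell probabilities are equal
p = pvalue(t)
```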

## t-test

`HypothesisTests.OneSampleTTest`

— Type `OneSampleTTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)`

Perform a one sample t-test of the null hypothesis that `n` values with mean `xbar` and sample standard deviation `stddev` come from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.

`OneSampleTTest(v::AbstractVector{T<:Real}, μ0::Real = 0)`

Perform a one sample t-test of the null hypothesis that the data in vector `v` comes from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.

`OneSampleTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)`

Perform a paired sample t-test of the null hypothesis that the differences between pairs of values in vectors `x` and `y` come from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.

This test is also known as a t-test for paired or dependent samples; see paired difference test on Wikipedia.
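A short sketch of the one-sample and paired forms, with hypothetical measurements:

```julia
using HypothesisTests

v  = [5.1, 4.9, 5.4, 5.0, 5.3, 4.8]   # hypothetical measurements
t  = OneSampleTTest(v, 5.0)           # H0: population mean is 5.0
p  = pvalue(t)
ci = confint(t)                       # 95% interval by default

# Paired form: the differences x .- y are tested against μ0 = 0.
x  = [1.2, 2.1, 3.3, 4.0]
y  = [1.0, 2.0, 3.0, 4.2]
tp = OneSampleTTest(x, y)
```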

`HypothesisTests.EqualVarianceTTest`

— Type `EqualVarianceTTest(nx::Int, ny::Int, mx::Real, my::Real, vx::Real, vy::Real, μ0::Real=0)`

Perform a two-sample t-test of the null hypothesis that samples `x` and `y`, described by the number of elements `nx` and `ny`, the means `mx` and `my`, and the variances `vx` and `vy`, come from distributions with equal means and variances. The alternative hypothesis is that the distributions have different means but equal variances.

`EqualVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})`

Perform a two-sample t-test of the null hypothesis that `x` and `y` come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.

`HypothesisTests.UnequalVarianceTTest`

— Type `UnequalVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})`

Perform an unequal variance two-sample t-test of the null hypothesis that `x` and `y` come from distributions with equal means against the alternative hypothesis that the distributions have different means.

This test is sometimes known as Welch's t-test. It differs from the equal variance t-test in that it computes the number of degrees of freedom of the test using the Welch-Satterthwaite equation:

\[ ν_{χ'} ≈ \frac{\left(\sum_{i=1}^n k_i s_i^2\right)^2}{\sum_{i=1}^n \frac{(k_i s_i^2)^2}{ν_i}}\]
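A side-by-side sketch of the two two-sample variants, on hypothetical samples with a clear mean shift:

```julia
using HypothesisTests

x = [20.1, 19.8, 21.2, 20.5, 19.9]    # hypothetical sample 1
y = [22.4, 21.9, 23.1, 22.0, 22.7]    # hypothetical sample 2

t_student = EqualVarianceTTest(x, y)     # pooled-variance (Student) t-test
t_welch   = UnequalVarianceTTest(x, y)   # Welch's t-test

p_student, p_welch = pvalue(t_student), pvalue(t_welch)
```

When the sample variances are similar, the two tests give close (but not identical) p-values; Welch's version is the safer default when equal variances cannot be assumed.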

## z-test

`HypothesisTests.OneSampleZTest`

— Type `OneSampleZTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)`

Perform a one sample z-test of the null hypothesis that `n` values with mean `xbar` and population standard deviation `stddev` come from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.

`OneSampleZTest(v::AbstractVector{T<:Real}, μ0::Real = 0)`

Perform a one sample z-test of the null hypothesis that the data in vector `v` comes from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.

`OneSampleZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)`

Perform a paired sample z-test of the null hypothesis that the differences between pairs of values in vectors `x` and `y` come from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.

`HypothesisTests.EqualVarianceZTest`

— Type `EqualVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})`

Perform a two-sample z-test of the null hypothesis that `x` and `y` come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.

`HypothesisTests.UnequalVarianceZTest`

— Type `UnequalVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})`

Perform an unequal variance two-sample z-test of the null hypothesis that `x` and `y` come from distributions with equal means against the alternative hypothesis that the distributions have different means.
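A brief sketch of the vector-based z-test constructors, with hypothetical data:

```julia
using HypothesisTests

v  = [0.3, -0.1, 0.5, 0.2, 0.0, 0.4, 0.1, -0.2]  # hypothetical sample
t1 = OneSampleZTest(v)          # H0: mean is 0
p1 = pvalue(t1)

x  = [5.2, 5.5, 5.1, 5.4]
y  = [5.8, 6.0, 5.7, 6.1]
t2 = UnequalVarianceZTest(x, y)
p2 = pvalue(t2)
```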

## F-test

`HypothesisTests.VarianceFTest`

— Type `VarianceFTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})`

Perform an F-test of the null hypothesis that two real-valued vectors `x` and `y` have equal variances.

Implements: `pvalue`

**References**

- George E. P. Box, "Non-Normality and Tests on Variances", Biometrika 40 (3/4): 318–335, 1953.
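A minimal sketch, on hypothetical samples whose spreads differ visibly:

```julia
using HypothesisTests

x = [1.1, 2.3, 1.9, 2.8, 1.5, 2.2]   # hypothetical, low spread
y = [5.0, 9.5, 3.2, 8.1, 1.0, 7.7]   # hypothetical, high spread
t = VarianceFTest(x, y)
p = pvalue(t)
```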


## One-way ANOVA Test

`HypothesisTests.OneWayANOVATest`

— Function `OneWayANOVATest(groups::AbstractVector{<:Real}...)`

Perform a one-way analysis of variance test of the hypothesis that the `groups` means are equal.

The one-way analysis of variance (one-way ANOVA) is a technique that can be used to compare means of two or more samples. The ANOVA tests the null hypothesis, which states that samples in all groups are drawn from populations with the same mean values. To do this, two estimates are made of the population variance. The ANOVA produces an F-statistic, the ratio of the variance calculated among the means to the variance within the samples.

Implements: `pvalue`
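A minimal sketch with three hypothetical groups:

```julia
using HypothesisTests

g1 = [27.0, 26.2, 28.8, 33.5, 28.8]   # hypothetical group data
g2 = [22.8, 23.1, 27.7, 27.6, 24.0]
g3 = [21.9, 23.4, 20.1, 27.8, 19.3]
t  = OneWayANOVATest(g1, g2, g3)      # H0: all group means are equal
p  = pvalue(t)
```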


## Levene's Test

`HypothesisTests.LeveneTest`

— Function `LeveneTest(groups::AbstractVector{<:Real}...; scorediff=abs, statistic=mean)`

Perform Levene's test of the hypothesis that the `groups` variances are equal. By default the mean `statistic` is used for centering in each of the `groups`, but other statistics are accepted: median or truncated mean; see `BrownForsytheTest`. By default the absolute value of the score difference, `scorediff`, is used, but other functions are accepted: $x^2$ or $\sqrt{|x|}$.

The test statistic, $W$, is equivalent to the $F$ statistic, and is defined as follows:

\[W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^k N_i (Z_{i\cdot}-Z_{\cdot\cdot})^2} {\sum_{i=1}^k \sum_{j=1}^{N_i} (Z_{ij}-Z_{i\cdot})^2},\]

where

- $k$ is the number of different groups to which the sampled cases belong,
- $N_i$ is the number of cases in the $i$th group,
- $N$ is the total number of cases in all groups,
- $Y_{ij}$ is the value of the measured variable for the $j$th case from the $i$th group,
- $Z_{ij} = |Y_{ij} - \bar{Y}_{i\cdot}|$, $\bar{Y}_{i\cdot}$ is a mean of the $i$th group,
- $Z_{i\cdot} = \frac{1}{N_i} \sum_{j=1}^{N_i} Z_{ij}$ is the mean of the $Z_{ij}$ for group $i$,
- $Z_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^k \sum_{j=1}^{N_i} Z_{ij}$ is the mean of all $Z_{ij}$.

The test statistic $W$ is approximately $F$-distributed with $k-1$ and $N-k$ degrees of freedom.

Implements: `pvalue`

**References**

- Levene, Howard, "Robust tests for equality of variances". In Ingram Olkin; Harold Hotelling; et al. (eds.). Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Stanford University Press. pp. 278–292, 1960
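A brief sketch showing the default mean centering alongside median centering, with hypothetical groups:

```julia
using HypothesisTests, Statistics

g1 = [14.0, 15.2, 13.8, 16.1, 14.9]   # hypothetical, low spread
g2 = [12.1, 18.9, 11.0, 19.5, 10.4]   # hypothetical, high spread

t_mean   = LeveneTest(g1, g2)                        # classical (mean-centered) Levene
t_median = LeveneTest(g1, g2; statistic = median)    # median-centered variant
```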


## Brown-Forsythe Test

`HypothesisTests.BrownForsytheTest`

— Function `BrownForsytheTest(groups::AbstractVector{<:Real}...)`

The Brown–Forsythe test is a statistical test for the equality of the `groups` variances.

The Brown–Forsythe test is a modification of Levene's test, using the median instead of the mean for computing the spread within each group.

Implements: `pvalue`

**References**

- Brown, Morton B.; Forsythe, Alan B., "Robust tests for the equality of variances". Journal of the American Statistical Association. 69: 364–367, 1974 doi:10.1080/01621459.1974.10482955.
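A minimal sketch, reusing the hypothetical groups from the Levene's test description; per the definition above, the result coincides with a median-centered Levene's test:

```julia
using HypothesisTests, Statistics

g1 = [14.0, 15.2, 13.8, 16.1, 14.9]   # hypothetical data
g2 = [12.1, 18.9, 11.0, 19.5, 10.4]

t_bf = BrownForsytheTest(g1, g2)
t_lv = LeveneTest(g1, g2; statistic = median)   # same test, spelled out
```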
