Parametric tests

Power divergence test

HypothesisTests.PowerDivergenceTest — Type

PowerDivergenceTest(x[, y]; lambda = 1.0, theta0 = ones(length(x))/length(x))

Perform a Power Divergence test.

If y is not given and x is a matrix with one row or column, or x is a vector, then a goodness-of-fit test is performed (x is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0, or are all equal if theta0 is not given.

If x is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x and y must be vectors of the same length. The contingency table is calculated using the counts function from the StatsBase package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.

Note that the entries of x (and y if provided) must be non-negative integers.

By default, the computed confidence intervals are Quesenberry-Hurst intervals if the minimum of the expected cell counts exceeds 100, and Sison-Glaz intervals otherwise. See the confint(::PowerDivergenceTest) documentation for a list of supported methods to compute confidence intervals.

The power divergence test statistic is given by

\[ \dfrac{2}{λ(λ+1)}\sum_{i=1}^I \sum_{j=1}^J n_{ij} \left[(n_{ij} /\hat{n}_{ij})^λ -1\right]\]

where $n_{ij}$ is the observed cell count in the $i$th row and $j$th column, $\hat{n}_{ij}$ is the corresponding expected cell count under the null hypothesis, and $λ$ is a real number determining the nature of the test to be performed:

$λ = 1$: equals Pearson's chi-squared statistic
$λ \to 0$: converges to the likelihood ratio test statistic
$λ \to -1$: converges to the minimum discrimination information statistic (Gokhale and Kullback, 1978)
$λ = -2$: equals Neyman's modified chi-squared statistic (Neyman, 1949)
$λ = -1/2$: equals the Freeman-Tukey statistic (Freeman and Tukey, 1950).

Under regularity conditions, the asymptotic distributions are identical (see Drost et al., 1989). The $χ^2$ null approximation works best for $λ$ near $2/3$.
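As a rough illustration of the formula above, the statistic can be computed by hand for a one-dimensional table. This is a minimal sketch, not the package's implementation: `power_divergence` is a hypothetical helper name, the counts are made up, and the two limiting cases are written out explicitly.

```julia
# Power divergence statistic for observed and expected counts.
# Hypothetical helper for illustration; not part of HypothesisTests.
function power_divergence(observed, expected, λ)
    pairs = zip(observed, expected)
    if λ == 0
        # limit λ → 0: likelihood ratio statistic
        return 2 * sum(o * log(o / e) for (o, e) in pairs if o > 0)
    elseif λ == -1
        # limit λ → -1: minimum discrimination information statistic
        return 2 * sum(e * log(e / o) for (o, e) in pairs)
    else
        return 2 / (λ * (λ + 1)) * sum(o * ((o / e)^λ - 1) for (o, e) in pairs)
    end
end

observed = [20, 30, 50]                      # made-up counts
expected = sum(observed) .* fill(1 / 3, 3)   # uniform theta0

pearson = power_divergence(observed, expected, 1)
classic = sum((observed .- expected) .^ 2 ./ expected)   # textbook Pearson form
lr      = power_divergence(observed, expected, 0)
```

With $λ = 1$ the result agrees with the familiar $\sum (n - \hat{n})^2 / \hat{n}$ form, and values of $λ$ close to 0 approach the likelihood ratio statistic.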

Implements: pvalue, confint(::PowerDivergenceTest)

References

Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.

StatsAPI.confint — Method

confint(test::PowerDivergenceTest; level = 0.95, tail = :both, method = :auto)

Compute a confidence interval with coverage level for multinomial proportions using one of the following methods. Possible values for method are:

:auto (default): If the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used, otherwise Sison-Glaz.
:sison_glaz: Sison-Glaz intervals
:bootstrap: Bootstrap intervals
:quesenberry_hurst: Quesenberry-Hurst intervals
:gold: Gold intervals (asymptotic simultaneous intervals)
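A short usage sketch, assuming HypothesisTests is installed and that the result is one (lower, upper) pair per cell. The counts are made up for illustration.

```julia
using HypothesisTests

counts_observed = [20, 30, 50]            # made-up cell counts
test = PowerDivergenceTest(counts_observed)

# Let the package choose the method (:auto), then request one explicitly.
ci_default = confint(test)
ci_qh = confint(test; level = 0.95, method = :quesenberry_hurst)
```

Requesting a method explicitly keeps results comparable across data sets whose expected counts fall on different sides of the :auto threshold.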

References

Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.
Sison, C.P. and Glaz, J. Simultaneous confidence intervals and sample size determination for multinomial proportions. Journal of the American Statistical Association, 90:366-369, 1995.
Quesenberry, C.P. and Hurst, D.C. Large Sample Simultaneous Confidence Intervals for Multinomial Proportions. Technometrics, 6:191-195, 1964.
Gold, R. Z. Tests Auxiliary to $χ^2$ Tests in a Markov Chain. Annals of Mathematical Statistics, 34:56-74, 1963.

Pearson chi-squared test

HypothesisTests.ChisqTest — Function

ChisqTest(x[, y][, theta0 = ones(length(x))/length(x)])

Perform a Pearson chi-squared test (equivalent to a PowerDivergenceTest with $λ = 1$).

If y is not given and x is a matrix with one row or column, or x is a vector, then a goodness-of-fit test is performed (x is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0, or are all equal if theta0 is not given.

If only x and y are given and both are vectors of integer type, then once again a goodness-of-fit test is performed. In this case, theta0 is calculated from the proportion of each individual value in y. Here, the hypothesis tested is whether the two samples x and y come from the same population or not.

If x is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x and y must be vectors of the same length. The contingency table is calculated using the counts function from the StatsBase package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.

Note that the entries of x (and y if provided) must be non-negative integers.

Implements: pvalue, confint
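The two main calling patterns can be sketched as follows, assuming HypothesisTests is installed. The counts are made up for illustration.

```julia
using HypothesisTests

# Goodness of fit: are the three categories equally likely?
gof = ChisqTest([20, 30, 50])
p_gof = pvalue(gof)

# Independence of rows and columns in a 2×2 contingency table.
table = [12 5; 7 16]
ind = ChisqTest(table)
p_ind = pvalue(ind)
```

For the goodness-of-fit call the statistic is 14 on 2 degrees of freedom, so the p-value should be close to exp(-7).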

Multinomial likelihood ratio test

HypothesisTests.MultinomialLRTest — Function

MultinomialLRTest(x[, y][, theta0 = ones(length(x))/length(x)])

Perform a multinomial likelihood ratio test (equivalent to a PowerDivergenceTest with $λ = 0$).

If y is not given and x is a matrix with one row or column, or x is a vector, then a goodness-of-fit test is performed (x is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in theta0, or are all equal if theta0 is not given.

If x is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, x and y must be vectors of the same length. The contingency table is calculated using the counts function from the StatsBase package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals.

Note that the entries of x (and y if provided) must be non-negative integers.

Implements: pvalue, confint

t-test

HypothesisTests.OneSampleTTest — Type

OneSampleTTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)

Perform a one sample t-test of the null hypothesis that n values with mean xbar and sample standard deviation stddev come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

OneSampleTTest(v::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a one sample t-test of the null hypothesis that the data in vector v comes from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

OneSampleTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a paired sample t-test of the null hypothesis that the differences between pairs of values in vectors x and y come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

Note

This test is also known as a t-test for paired or dependent samples, see paired difference test on Wikipedia.
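The paired form amounts to a one-sample test on the element-wise differences. The sketch below, which assumes HypothesisTests is installed and uses made-up measurements, compares the two calls.

```julia
using HypothesisTests

before = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5]   # made-up paired data
after  = [11.6, 11.0, 12.8, 12.0, 11.7, 12.1]

paired   = OneSampleTTest(before, after)      # paired-sample form
on_diffs = OneSampleTTest(before .- after)    # one-sample form on the differences

p_paired = pvalue(paired)
p_diffs  = pvalue(on_diffs)
```

Both calls should yield the same p-value, since each tests whether the mean difference equals μ0 = 0.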

HypothesisTests.EqualVarianceTTest — Type

EqualVarianceTTest(nx::Int, ny::Int, mx::Real, my::Real, vx::Real, vy::Real, μ0::Real=0)

Perform a two-sample t-test of the null hypothesis that samples x and y described by the number of elements nx and ny, the means mx and my, and the variances vx and vy come from distributions with equal means and variances. The alternative hypothesis is that the distributions have different means but equal variances.

Implements: pvalue, confint

EqualVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform a two-sample t-test of the null hypothesis that x and y come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.

Implements: pvalue, confint

HypothesisTests.UnequalVarianceTTest — Type

UnequalVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform an unequal variance two-sample t-test of the null hypothesis that x and y come from distributions with equal means against the alternative hypothesis that the distributions have different means.

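This test is sometimes known as Welch's t-test. It differs from the equal variance t-test in that it computes the number of degrees of freedom of the test using the Welch-Satterthwaite equation, where the sum runs over the $n$ samples (here $n = 2$), $s_i^2$ is the variance of sample $i$, $ν_i$ is its degrees of freedom (sample size minus one) and $k_i$ is the reciprocal of its sample size: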

\[ ν_{χ'} ≈ \frac{\left(\sum_{i=1}^n k_i s_i^2\right)^2}{\sum_{i=1}^n \frac{(k_i s_i^2)^2}{ν_i}}\]

Implements: pvalue, confint
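The Welch-Satterthwaite degrees of freedom can be evaluated by hand for two samples. This is a minimal sketch that uses only the Statistics standard library; it takes $k_i = 1/n_i$ and $ν_i = n_i - 1$, and the data are made up.

```julia
using Statistics

x = [5.1, 4.9, 6.2, 5.8, 6.0]         # made-up sample 1
y = [4.1, 5.0, 3.9, 4.4, 4.6, 4.8]    # made-up sample 2

# k_i * s_i^2 for each sample, with k_i = 1 / n_i
terms = [var(s) / length(s) for s in (x, y)]
# ν_i = n_i - 1 for each sample
dofs = [length(s) - 1 for s in (x, y)]

ν = sum(terms)^2 / sum(terms .^ 2 ./ dofs)
```

The result always lies between the smaller of the two $ν_i$ and their sum, which is the degrees of freedom the equal variance test would use.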

z-test

HypothesisTests.OneSampleZTest — Type

OneSampleZTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)

Perform a one sample z-test of the null hypothesis that n values with mean xbar and population standard deviation stddev come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint
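A usage sketch for the summary-statistics form, assuming HypothesisTests is installed. The numbers are chosen so that the z statistic is about 1.96.

```julia
using HypothesisTests

# 100 observations with sample mean 0.392 and known population
# standard deviation 2, tested against μ0 = 0.
ztest = OneSampleZTest(0.392, 2.0, 100, 0.0)

p  = pvalue(ztest)    # two-sided by default
ci = confint(ztest)   # interval for the mean at the default level
```

Here the standard error is 2 / sqrt(100) = 0.2, so z = 0.392 / 0.2 ≈ 1.96 and the two-sided p-value should be close to 0.05.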

OneSampleZTest(v::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a one sample z-test of the null hypothesis that the data in vector v comes from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

OneSampleZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)

Perform a paired sample z-test of the null hypothesis that the differences between pairs of values in vectors x and y come from a distribution with mean μ0 against the alternative hypothesis that the distribution does not have mean μ0.

Implements: pvalue, confint

HypothesisTests.EqualVarianceZTest — Type

EqualVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform a two-sample z-test of the null hypothesis that x and y come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.

Implements: pvalue, confint

HypothesisTests.UnequalVarianceZTest — Type

UnequalVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})

Perform an unequal variance two-sample z-test of the null hypothesis that x and y come from distributions with equal means against the alternative hypothesis that the distributions have different means.

Implements: pvalue, confint

F-test

HypothesisTests.VarianceFTest — Type

VarianceFTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})

Perform an F-test of the null hypothesis that two real-valued vectors x and y have equal variances.

Implements: pvalue
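A brief usage sketch, assuming HypothesisTests is installed. The samples are made up for illustration.

```julia
using HypothesisTests

x = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]   # made-up sample with a small spread
y = [4.1, 7.0, 2.9, 6.4, 3.6, 5.8]   # made-up sample with a larger spread

ftest = VarianceFTest(x, y)
p = pvalue(ftest)
```

As the Box (1953) reference below discusses, this test is sensitive to non-normality. Levene's test or the Brown-Forsythe test may be preferable when normality is in doubt.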

References

George E. P. Box, "Non-Normality and Tests on Variances", Biometrika 40 (3/4): 318–335, 1953.

External links

F-test of equality of variances on Wikipedia

One-way ANOVA Test

HypothesisTests.OneWayANOVATest — Function

OneWayANOVATest(groups)
OneWayANOVATest(groups::AbstractVector{<:Real}...)

Perform a one-way analysis of variance test of the hypothesis that the group means are equal.

The one-way analysis of variance (one-way ANOVA) is a technique that can be used to compare the means of two or more samples. The ANOVA tests the null hypothesis that the samples in all groups are drawn from populations with the same mean. To do this, two estimates are made of the population variance. The ANOVA produces an F-statistic, the ratio of the variance calculated among the means to the variance within the samples.

Implements: pvalue
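The F-statistic described above can be computed by hand as a check. This is a minimal sketch that uses only the Statistics standard library. The data and the variable names are illustrative.

```julia
using Statistics

groups = [[6.0, 8.0, 4.0, 5.0, 3.0, 4.0],
          [8.0, 12.0, 9.0, 11.0, 6.0, 8.0],
          [13.0, 9.0, 11.0, 8.0, 7.0, 12.0]]

k = length(groups)                    # number of groups
N = sum(length, groups)               # total number of observations
grand = mean(reduce(vcat, groups))    # grand mean over all observations

# Variance among the group means (between-group mean square).
msb = sum(length(g) * (mean(g) - grand)^2 for g in groups) / (k - 1)
# Variance within the samples (within-group mean square).
msw = sum(sum((g .- mean(g)) .^ 2) for g in groups) / (N - k)

F = msb / msw
```

For these data the group means are 5, 9 and 10, giving msb = 42 and msw = 68/15, so F ≈ 9.26 on 2 and 15 degrees of freedom.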

External links

One-way analysis of variance on Wikipedia

Levene's Test

HypothesisTests.LeveneTest — Function

LeveneTest(groups; scorediff=abs, statistic=mean)
LeveneTest(groups::AbstractVector{<:Real}...; scorediff=abs, statistic=mean)

Perform Levene's test of the hypothesis that the group variances are equal. By default the mean statistic is used for centering in each of the groups, but other statistics are accepted: median or truncated mean, see BrownForsytheTest. By default the absolute value of the score difference, scorediff, is used, but other functions are accepted: x² or √|x|.

The test statistic, $W$, is equivalent to the $F$ statistic, and is defined as follows:

\[W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^k N_i (Z_{i\cdot}-Z_{\cdot\cdot})^2} {\sum_{i=1}^k \sum_{j=1}^{N_i} (Z_{ij}-Z_{i\cdot})^2},\]

where

$k$ is the number of different groups to which the sampled cases belong,
$N_i$ is the number of cases in the $i$th group,
$N$ is the total number of cases in all groups,
$Y_{ij}$ is the value of the measured variable for the $j$th case from the $i$th group,
$Z_{ij} = |Y_{ij} - \bar{Y}_{i\cdot}|$, where $\bar{Y}_{i\cdot}$ is the mean of the $i$th group,
$Z_{i\cdot} = \frac{1}{N_i} \sum_{j=1}^{N_i} Z_{ij}$ is the mean of the $Z_{ij}$ for group $i$,
$Z_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^k \sum_{j=1}^{N_i} Z_{ij}$ is the mean of all $Z_{ij}$.

The test statistic $W$ is approximately $F$-distributed with $k-1$ and $N-k$ degrees of freedom.
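The definition of $W$ can be followed step by step in code. This is a minimal sketch that uses only the Statistics standard library. `levene_w` and its `center` keyword are hypothetical names for illustration, not the package's API, and the data are made up.

```julia
using Statistics

# Levene's W with absolute deviations from a per-group center.
# Hypothetical helper for illustration; not part of HypothesisTests.
function levene_w(groups; center = mean)
    k = length(groups)
    N = sum(length, groups)
    Z = [abs.(g .- center(g)) for g in groups]   # Z_ij
    Zi = mean.(Z)                                # Z_i. (per-group means)
    Zall = mean(reduce(vcat, Z))                 # Z_.. (overall mean)
    num = sum(length(z) * (zi - Zall)^2 for (z, zi) in zip(Z, Zi))
    den = sum(sum((z .- zi) .^ 2) for (z, zi) in zip(Z, Zi))
    return (N - k) / (k - 1) * num / den
end

groups = [[6.0, 8.0, 4.0, 5.0, 3.0, 4.0],
          [8.0, 12.0, 9.0, 11.0, 6.0, 8.0],
          [13.0, 9.0, 11.0, 8.0, 7.0, 12.0]]

W = levene_w(groups)
```

For these data W works out to 0.6. Rescaling every observation by a common factor leaves W unchanged, since the numerator and denominator scale together.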

Implements: pvalue

References

Levene, Howard, "Robust tests for equality of variances". In Ingram Olkin; Harold Hotelling; et al. (eds.). Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Stanford University Press. pp. 278–292, 1960.

External links

Levene's test on Wikipedia

Brown-Forsythe Test

HypothesisTests.BrownForsytheTest — Function

BrownForsytheTest(groups)
BrownForsytheTest(groups::AbstractVector{<:Real}...)

The Brown–Forsythe test is a statistical test for the equality of group variances.

The Brown–Forsythe test is a modification of Levene's test that uses the median instead of the mean statistic for computing the spread within each group.

Implements: pvalue
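The relationship to Levene's test can be checked directly. This sketch assumes HypothesisTests is installed and uses the variadic signatures shown above, with made-up data.

```julia
using HypothesisTests, Statistics

g1 = [6.0, 8.0, 4.0, 5.0, 3.0, 4.0]      # made-up groups
g2 = [8.0, 12.0, 9.0, 11.0, 6.0, 8.0]
g3 = [13.0, 9.0, 11.0, 8.0, 7.0, 12.0]

bf = BrownForsytheTest(g1, g2, g3)
lev_median = LeveneTest(g1, g2, g3; statistic = median)

p_bf  = pvalue(bf)
p_lev = pvalue(lev_median)
```

Both calls center each group at its median, so the two p-values are expected to agree.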

References

Brown, Morton B.; Forsythe, Alan B., "Robust tests for the equality of variances". Journal of the American Statistical Association. 69: 364–367, 1974. doi:10.1080/01621459.1974.10482955.

External links

Brown–Forsythe test on Wikipedia