Parametric tests
Power divergence test
HypothesisTests.PowerDivergenceTest - Type
`PowerDivergenceTest(x[, y]; lambda = 1.0, theta0 = ones(length(x))/length(x))`
Perform a Power Divergence test.
If `y` is not given and `x` is a matrix with one row or column, or `x` is a vector, then a goodness-of-fit test is performed (`x` is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in `theta0`, or are all equal if `theta0` is not given.
If `x` is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, `x` and `y` must be vectors of the same length. The contingency table is calculated using the `counts` function from the `StatsBase` package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a two-dimensional contingency table is the product of the row and column marginals.
Note that the entries of `x` (and `y` if provided) must be non-negative integers.
By default, the computed confidence intervals are Quesenberry-Hurst intervals if the minimum of the expected cell counts exceeds 100, and Sison-Glaz intervals otherwise. See the `confint(::PowerDivergenceTest)` documentation for a list of supported methods to compute confidence intervals.
The power divergence test statistic is given by
\[ \dfrac{2}{λ(λ+1)}\sum_{i=1}^I \sum_{j=1}^J n_{ij} \left[(n_{ij} /\hat{n}_{ij})^λ -1\right]\]
where $n_{ij}$ is the cell count in the $i$th row and $j$th column, $\hat{n}_{ij}$ is the corresponding expected cell count under the null hypothesis, and $λ$ is a real number determining the nature of the test to be performed:
- $λ = 1$: equal to Pearson's chi-squared statistic
- $λ \to 0$: converges to the likelihood ratio test statistic
- $λ \to -1$: converges to the minimum discrimination information statistic (Gokhale and Kullback, 1978)
- $λ = -2$: equals Neyman modified chi-squared (Neyman, 1949)
- $λ = -1/2$: equals the Freeman-Tukey statistic (Freeman and Tukey, 1950).
Under regularity conditions, the asymptotic distributions are identical (see Drost et al., 1989). The $χ^2$ null approximation works best for $λ$ near $2/3$.
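As a rough illustration of the formula above, the following sketch evaluates the statistic for a one-dimensional table of counts. The helper name `power_divergence` and the data are made up for this example, and the code is not the package's implementation. The cases $λ = 0$ and $λ = -1$ are handled through their limiting closed forms.

```julia
# Power divergence statistic for observed counts `n` and expected counts `nhat`.
# Illustrative helper only; not taken from HypothesisTests.
function power_divergence(n::AbstractVector{<:Real}, nhat::AbstractVector{<:Real}, λ::Real)
    if λ == 0
        # Limit λ → 0: likelihood ratio statistic 2 Σ n log(n / n̂)
        return 2 * sum(ni == 0 ? 0.0 : ni * log(ni / ei) for (ni, ei) in zip(n, nhat))
    elseif λ == -1
        # Limit λ → -1: minimum discrimination information statistic 2 Σ n̂ log(n̂ / n)
        return 2 * sum(ei * log(ei / ni) for (ni, ei) in zip(n, nhat))
    else
        return 2 / (λ * (λ + 1)) * sum(ni * ((ni / ei)^λ - 1) for (ni, ei) in zip(n, nhat))
    end
end

n = [20, 30, 50]                 # observed counts (made-up data)
theta0 = fill(1 / 3, 3)          # equal cell probabilities under the null
nhat = sum(n) .* theta0          # expected counts

# λ = 1 reproduces Pearson's chi-squared statistic Σ (n - n̂)² / n̂
pearson = sum((n .- nhat) .^ 2 ./ nhat)
matches_pearson = power_divergence(n, nhat, 1) ≈ pearson
```

Evaluating the helper at a small nonzero `λ` should give a value close to the `λ == 0` branch, which is one way to check the limiting form.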
Implements: `pvalue`, `confint(::PowerDivergenceTest)`
References
- Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.
StatsAPI.confint - Method
`confint(test::PowerDivergenceTest; level = 0.95, tail = :both, method = :auto)`
Compute a confidence interval with coverage `level` for multinomial proportions using one of the following methods. Possible values for `method` are:
- `:auto` (default): If the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used, otherwise Sison-Glaz.
- `:sison_glaz`: Sison-Glaz intervals
- `:bootstrap`: Bootstrap intervals
- `:quesenberry_hurst`: Quesenberry-Hurst intervals
- `:gold`: Gold intervals (asymptotic simultaneous intervals)
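A minimal usage sketch, assuming the HypothesisTests package is installed. The counts are made up, and only the constructor, `pvalue`, and the `confint` keywords documented above are used.

```julia
using HypothesisTests

x = [20, 30, 50]                      # observed counts (made-up data)
test = PowerDivergenceTest(x)         # goodness-of-fit test with default lambda and theta0

p = pvalue(test)

# :auto picks the interval method from the expected cell counts
ci_auto = confint(test)
# Request a specific method and coverage level explicitly
ci_qh = confint(test; level = 0.90, method = :quesenberry_hurst)
```

With these counts the minimum expected cell count is well below 100, so by the rule above `:auto` should fall back to Sison-Glaz intervals.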
References
- Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.
- Sison, C.P. and Glaz, J. Simultaneous confidence intervals and sample size determination for multinomial proportions. Journal of the American Statistical Association, 90:366-369, 1995.
- Quesenberry, C.P. and Hurst, D.C. Large Sample Simultaneous Confidence Intervals for Multinomial Proportions. Technometrics, 6:191-195, 1964.
- Gold, R. Z. Tests Auxiliary to $χ^2$ Tests in a Markov Chain. Annals of Mathematical Statistics, 30:56-74, 1963.
Pearson chi-squared test
HypothesisTests.ChisqTest - Function
`ChisqTest(x[, y][, theta0 = ones(length(x))/length(x)])`
Perform a Pearson chi-squared test (equivalent to a `PowerDivergenceTest` with $λ = 1$).
If `y` is not given and `x` is a matrix with one row or column, or `x` is a vector, then a goodness-of-fit test is performed (`x` is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in `theta0`, or are all equal if `theta0` is not given.
If only `y` and `x` are given and both are vectors of integer type, then once again a goodness-of-fit test is performed. In this case, `theta0` is calculated from the proportion of each individual value in `y`. Here, the hypothesis tested is whether the two samples `x` and `y` come from the same population or not.
If `x` is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, `x` and `y` must be vectors of the same length. The contingency table is calculated using the `counts` function from the `StatsBase` package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a two-dimensional contingency table is the product of the row and column marginals.
Note that the entries of `x` (and `y` if provided) must be non-negative integers.
Multinomial likelihood ratio test
HypothesisTests.MultinomialLRTest - Function
`MultinomialLRTest(x[, y][, theta0 = ones(length(x))/length(x)])`
Perform a multinomial likelihood ratio test (equivalent to a `PowerDivergenceTest` with $λ = 0$).
If `y` is not given and `x` is a matrix with one row or column, or `x` is a vector, then a goodness-of-fit test is performed (`x` is treated as a one-dimensional contingency table). In this case, the hypothesis tested is whether the population probabilities equal those in `theta0`, or are all equal if `theta0` is not given.
If `x` is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table. Otherwise, `x` and `y` must be vectors of the same length. The contingency table is calculated using the `counts` function from the `StatsBase` package. Then the power divergence test is conducted under the null hypothesis that the joint distribution of the cell counts in a two-dimensional contingency table is the product of the row and column marginals.
Note that the entries of `x` (and `y` if provided) must be non-negative integers.
t-test
HypothesisTests.OneSampleTTest - Type
`OneSampleTTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)`
Perform a one sample t-test of the null hypothesis that `n` values with mean `xbar` and sample standard deviation `stddev` come from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.
`OneSampleTTest(v::AbstractVector{T<:Real}, μ0::Real = 0)`
Perform a one sample t-test of the null hypothesis that the data in vector `v` comes from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.
`OneSampleTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)`
Perform a paired sample t-test of the null hypothesis that the differences between pairs of values in vectors `x` and `y` come from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.
This test is also known as a t-test for paired or dependent samples; see the paired difference test article on Wikipedia.
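The three constructors above describe closely related tests. The sketch below, which assumes the HypothesisTests package is installed and uses made-up data, builds the same one-sample test from raw data and from summary statistics, and compares a paired test with a one-sample test on the differences.

```julia
using HypothesisTests, Statistics

v = [4.8, 5.1, 5.6, 4.9, 5.3, 5.0]   # made-up sample
μ0 = 5.0

# The same test built from raw data and from summary statistics
t_data    = OneSampleTTest(v, μ0)
t_summary = OneSampleTTest(mean(v), std(v), length(v), μ0)

# Paired test compared with a one-sample test on the differences x .- y
x = [12.1, 11.4, 13.0, 12.7, 11.9]   # made-up paired measurements
y = [11.8, 11.6, 12.1, 12.0, 11.5]
t_paired = OneSampleTTest(x, y)
t_diff   = OneSampleTTest(x .- y)

same_one_sample = pvalue(t_data) ≈ pvalue(t_summary)
same_paired     = pvalue(t_paired) ≈ pvalue(t_diff)
```

Both comparisons are expected to hold, since `std` returns the sample standard deviation that the summary-statistics constructor asks for.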
HypothesisTests.EqualVarianceTTest - Type
`EqualVarianceTTest(nx::Int, ny::Int, mx::Real, my::Real, vx::Real, vy::Real, μ0::Real=0)`
Perform a two-sample t-test of the null hypothesis that samples `x` and `y`, described by the numbers of elements `nx` and `ny`, the means `mx` and `my`, and the variances `vx` and `vy`, come from distributions with equal means and variances. The alternative hypothesis is that the distributions have different means but equal variances.
`EqualVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})`
Perform a two-sample t-test of the null hypothesis that `x` and `y` come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.
HypothesisTests.UnequalVarianceTTest - Type
`UnequalVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})`
Perform an unequal variance two-sample t-test of the null hypothesis that `x` and `y` come from distributions with equal means against the alternative hypothesis that the distributions have different means.
This test is sometimes known as Welch's t-test. It differs from the equal variance t-test in that it computes the number of degrees of freedom of the test using the Welch-Satterthwaite equation:
\[ ν_{χ'} ≈ \frac{\left(\sum_{i=1}^n k_i s_i^2\right)^2}{\sum_{i=1}^n \frac{(k_i s_i^2)^2}{ν_i}}\]
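The formula leaves its symbols undefined. A minimal sketch for the two-sample case follows, assuming the usual choices $k_i = 1/n_i$, $s_i^2$ the sample variance of sample $i$, and $ν_i = n_i - 1$. The helper name `welch_df` and the data are made up, and the code only illustrates the equation rather than reproducing the package's implementation.

```julia
using Statistics

# Welch-Satterthwaite degrees of freedom for two samples,
# assuming k_i = 1/n_i and ν_i = n_i - 1.
function welch_df(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
    nx, ny = length(x), length(y)
    a = var(x) / nx          # k_1 * s_1^2
    b = var(y) / ny          # k_2 * s_2^2
    return (a + b)^2 / (a^2 / (nx - 1) + b^2 / (ny - 1))
end

x = [5.1, 4.9, 5.6, 5.8, 5.0]             # made-up samples
y = [6.2, 5.9, 7.1, 6.5, 6.8, 6.0, 6.4]
ν = welch_df(x, y)
```

The result is generally not an integer. It lies between `min(nx, ny) - 1` and `nx + ny - 2`, the latter being the degrees of freedom of the equal variance t-test.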
z-test
HypothesisTests.OneSampleZTest - Type
`OneSampleZTest(xbar::Real, stddev::Real, n::Int, μ0::Real = 0)`
Perform a one sample z-test of the null hypothesis that `n` values with mean `xbar` and population standard deviation `stddev` come from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.
`OneSampleZTest(v::AbstractVector{T<:Real}, μ0::Real = 0)`
Perform a one sample z-test of the null hypothesis that the data in vector `v` comes from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.
`OneSampleZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, μ0::Real = 0)`
Perform a paired sample z-test of the null hypothesis that the differences between pairs of values in vectors `x` and `y` come from a distribution with mean `μ0` against the alternative hypothesis that the distribution does not have mean `μ0`.
HypothesisTests.EqualVarianceZTest - Type
`EqualVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})`
Perform a two-sample z-test of the null hypothesis that `x` and `y` come from distributions with equal means and variances against the alternative hypothesis that the distributions have different means but equal variances.
HypothesisTests.UnequalVarianceZTest - Type
`UnequalVarianceZTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})`
Perform an unequal variance two-sample z-test of the null hypothesis that `x` and `y` come from distributions with equal means against the alternative hypothesis that the distributions have different means.
F-test
HypothesisTests.VarianceFTest - Type
`VarianceFTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})`
Perform an F-test of the null hypothesis that two real-valued vectors `x` and `y` have equal variances.
Implements: pvalue
References
- George E. P. Box, "Non-Normality and Tests on Variances", Biometrika 40 (3/4): 318–335, 1953.
One-way ANOVA Test
HypothesisTests.OneWayANOVATest - Function
`OneWayANOVATest(groups::AbstractVector{<:Real}...)`
Perform a one-way analysis of variance test of the hypothesis that the `groups` means are equal.
The one-way analysis of variance (one-way ANOVA) is a technique that can be used to compare means of two or more samples. The ANOVA tests the null hypothesis, which states that samples in all groups are drawn from populations with the same mean values. To do this, two estimates are made of the population variance. The ANOVA produces an F-statistic, the ratio of the variance calculated among the means to the variance within the samples.
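The ratio described above can be sketched directly from its definition. The helper name `anova_f` and the data are made up for this illustration, and the code uses only the `Statistics` standard library rather than the package's implementation.

```julia
using Statistics

# F statistic of a one-way ANOVA: variance among the group means
# divided by the variance within the groups.
function anova_f(groups::AbstractVector{<:Real}...)
    k = length(groups)                      # number of groups
    N = sum(length, groups)                 # total number of observations
    grand = mean(vcat(groups...))           # grand mean over all observations
    ss_between = sum(length(g) * (mean(g) - grand)^2 for g in groups)
    ss_within  = sum(sum((g .- mean(g)) .^ 2) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (N - k))
end

# Made-up groups
F = anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
```

A large value of `F` means the group means vary more than the within-group scatter would suggest.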
Implements: pvalue
Levene's Test
HypothesisTests.LeveneTest - Function
`LeveneTest(groups::AbstractVector{<:Real}...; scorediff=abs, statistic=mean)`
Perform Levene's test of the hypothesis that the `groups` variances are equal. By default the `mean` `statistic` is used for centering in each of the `groups`, but other statistics are accepted: median or truncated mean, see `BrownForsytheTest`. By default the absolute value of the score difference, `scorediff`, is used, but other functions are accepted: x² or √|x|.
The test statistic, $W$, is equivalent to the $F$ statistic, and is defined as follows:
\[W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^k N_i (Z_{i\cdot}-Z_{\cdot\cdot})^2} {\sum_{i=1}^k \sum_{j=1}^{N_i} (Z_{ij}-Z_{i\cdot})^2},\]
where
- $k$ is the number of different groups to which the sampled cases belong,
- $N_i$ is the number of cases in the $i$th group,
- $N$ is the total number of cases in all groups,
- $Y_{ij}$ is the value of the measured variable for the $j$th case from the $i$th group,
- $Z_{ij} = |Y_{ij} - \bar{Y}_{i\cdot}|$, where $\bar{Y}_{i\cdot}$ is the mean of the $i$th group,
- $Z_{i\cdot} = \frac{1}{N_i} \sum_{j=1}^{N_i} Z_{ij}$ is the mean of the $Z_{ij}$ for group $i$,
- $Z_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^k \sum_{j=1}^{N_i} Z_{ij}$ is the mean of all $Z_{ij}$.
The test statistic $W$ is approximately $F$-distributed with $k-1$ and $N-k$ degrees of freedom.
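The definition of $W$ can be followed step by step in a short sketch. The helper name `levene_w` and the data are made up. The keyword names mirror those in the signature above, but the code is only an illustration of the formula, not the package's implementation.

```julia
using Statistics

# Levene's W computed directly from the definition above.
function levene_w(groups::AbstractVector{<:Real}...; scorediff = abs, statistic = mean)
    k = length(groups)                                   # number of groups
    N = sum(length, groups)                              # total number of cases
    Z = [scorediff.(g .- statistic(g)) for g in groups]  # Z_ij, one vector per group
    Zi = mean.(Z)                                        # group means Z_i·
    Zall = sum(sum, Z) / N                               # overall mean Z_··
    num = sum(length(z) * (zi - Zall)^2 for (z, zi) in zip(Z, Zi))
    den = sum(sum((z .- zi) .^ 2) for (z, zi) in zip(Z, Zi))
    return (N - k) / (k - 1) * num / den
end

g1 = [1, 2, 3, 4]        # made-up groups
g2 = [2, 4, 6, 8]
W = levene_w(g1, g2)
W_median = levene_w(g1, g2; statistic = median)   # centering on the median instead
```

For these two groups the mean and median coincide, so both calls give the same value. On skewed data they would generally differ.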
Implements: pvalue
References
- Levene, Howard, "Robust tests for equality of variances". In Ingram Olkin; Harold Hotelling; et al. (eds.). Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Stanford University Press. pp. 278–292, 1960
Brown-Forsythe Test
HypothesisTests.BrownForsytheTest - Function
`BrownForsytheTest(groups::AbstractVector{<:Real}...)`
The Brown–Forsythe test is a statistical test for the equality of `groups` variances.
The Brown–Forsythe test is a modification of Levene's test with the median instead of the mean statistic for computing the spread within each group.
Implements: pvalue
References
- Brown, Morton B.; Forsythe, Alan B., "Robust tests for the equality of variances". Journal of the American Statistical Association. 69: 364–367, 1974 doi:10.1080/01621459.1974.10482955.