Nonparametric tests

Anderson-Darling test

Both one-sample and $k$-sample tests are available.

OneSampleADTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)

Perform a one-sample Anderson–Darling test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.

Implements: pvalue

KSampleADTest(xs::AbstractVector{<:Real}...; modified = true, nsim = 0)

Perform a $k$-sample Anderson–Darling test of the null hypothesis that the data in the $k$ vectors xs come from the same distribution against the alternative hypothesis that the samples come from different distributions.

The modified parameter enables a modified test calculation for samples whose observations do not all coincide.

If nsim is equal to 0 (the default), the asymptotic calculation of the p-value is used. If it is greater than 0, the p-value is estimated by generating nsim random splits of the pooled data into $k$ samples, evaluating the AD statistic for each split, and computing the proportion of simulated values that are greater than or equal to the observed statistic. This proportion is reported as the p-value estimate.

Implements: pvalue

References

  • F. W. Scholz and M. A. Stephens, K-Sample Anderson-Darling Tests, Journal of the American Statistical Association, Vol. 82, No. 399. (Sep., 1987), pp. 918-924.
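
A brief usage sketch, assuming the HypothesisTests package is installed. The sample sizes, the seed, and the 0.5 shift are arbitrary illustrative choices, not values from the package documentation. It contrasts the asymptotic p-value with one estimated from random splits:

```julia
using HypothesisTests
using Random

Random.seed!(42)                 # arbitrary seed, only for reproducibility

# Three illustrative samples; the third is shifted by 0.5.
x = randn(30)
y = randn(40)
z = randn(35) .+ 0.5

# Asymptotic p-value (nsim = 0, the default).
p_asymptotic = pvalue(KSampleADTest(x, y, z))

# p-value estimated from 2000 random splits of the pooled data.
p_simulated = pvalue(KSampleADTest(x, y, z; nsim = 2000))

println("asymptotic: ", p_asymptotic, "  simulated: ", p_simulated)
```

The two values should be broadly similar for samples of this size; the simulated estimate varies with the seed.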

Binomial test

BinomialTest(x::Integer, n::Integer, p::Real = 0.5)
BinomialTest(x::AbstractVector{Bool}, p::Real = 0.5)

Perform a binomial test of the null hypothesis that the distribution from which x successes were encountered in n draws (or alternatively from which the vector x was drawn) has success probability p against the alternative hypothesis that the success probability is not equal to p.

By default, the computed confidence intervals (confint) are Clopper-Pearson intervals.

Implements: pvalue, confint(::BinomialTest)
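
A short usage sketch of both constructor forms, assuming the HypothesisTests package is installed. The counts and the outcome vector are made-up illustrative data:

```julia
using HypothesisTests

# 7 successes in 10 draws, tested against a success probability of 0.5.
t = BinomialTest(7, 10, 0.5)

p = pvalue(t)          # two-sided p-value
ci = confint(t)        # Clopper-Pearson interval by default

# The same test built from a vector of Boolean outcomes (7 of 10 are true).
outcomes = [true, true, false, true, true, false, true, true, false, true]
t_vec = BinomialTest(outcomes, 0.5)

println("p = ", p, ", CI = ", ci)
println("p from vector form = ", pvalue(t_vec))
```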

Fisher exact test

FisherExactTest(a::Integer, b::Integer, c::Integer, d::Integer)

Perform Fisher's exact test of the null hypothesis that the odds $a/c$ and $b/d$ are equal, that is, the odds ratio $(a/c) / (b/d)$ is one, against the alternative hypothesis that they are not equal.

See pvalue(::FisherExactTest) and confint(::FisherExactTest) for details about the computation of the default p-value and confidence interval, respectively.

The contingency table is structured as:

-     X1    X2
Y1    a     b
Y2    c     d

Note

The show function output contains the conditional maximum likelihood estimate of the odds ratio rather than the sample odds ratio; it maximizes the likelihood given by Fisher's non-central hypergeometric distribution.

Implements: pvalue(::FisherExactTest), confint(::FisherExactTest)
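
A short usage sketch, assuming the HypothesisTests package is installed. The table entries are made-up illustrative counts. The sample odds ratio is computed by hand for comparison with the conditional estimate that show reports:

```julia
using HypothesisTests

# Illustrative contingency table:
#        X1  X2
#   Y1    3   1
#   Y2    1   3
a, b, c, d = 3, 1, 1, 3
t = FisherExactTest(a, b, c, d)

p = pvalue(t)                           # two-sided p-value (default method)
ci = confint(t)                         # confidence interval for the odds ratio
sample_odds_ratio = (a / c) / (b / d)   # plain sample odds ratio, not the conditional MLE

println("p = ", p, ", CI = ", ci, ", sample odds ratio = ", sample_odds_ratio)
```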

References

  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374.

Kolmogorov-Smirnov test

An exact one-sample test and approximate (i.e. asymptotic) one- and two-sample tests are available.

ExactOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)

Perform a one-sample exact Kolmogorov–Smirnov test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.

Implements: pvalue

ApproximateOneSampleKSTest(x::AbstractVector{<:Real}, d::UnivariateDistribution)

Perform an asymptotic one-sample Kolmogorov–Smirnov test of the null hypothesis that the data in vector x come from the distribution d against the alternative hypothesis that the sample is not drawn from d.

Implements: pvalue

ApproximateTwoSampleKSTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})

Perform an asymptotic two-sample Kolmogorov–Smirnov test of the null hypothesis that x and y are drawn from the same distribution against the alternative hypothesis that they come from different distributions.

Implements: pvalue
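
For intuition, the two-sample statistic is the largest absolute difference between the two empirical distribution functions. The sketch below computes it with Base Julia only. The function names are hypothetical and this is not the package's implementation, only an illustration of the quantity being tested:

```julia
# Empirical CDF of `sample` evaluated at `t`.
ecdf_at(sample, t) = count(<=(t), sample) / length(sample)

# Two-sample Kolmogorov–Smirnov statistic: the largest absolute gap
# between the two empirical CDFs, checked at every observed value.
function ks_two_sample_statistic(x, y)
    points = vcat(x, y)
    return maximum(abs(ecdf_at(x, t) - ecdf_at(y, t)) for t in points)
end

x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 4.0, 5.0, 6.0]
println(ks_two_sample_statistic(x, y))
```

Checking only the observed values suffices because both empirical CDFs are step functions that change only at those points.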

Kruskal-Wallis rank sum test

KruskalWallisTest(groups::AbstractVector{<:Real}...)

Perform a Kruskal-Wallis rank sum test of the null hypothesis that the groups $\mathcal{G}$ come from the same distribution against the alternative hypothesis that at least one group stochastically dominates one other group.

The Kruskal-Wallis test is an extension of the Mann-Whitney U test to more than two groups.

The p-value is computed using a $χ^2$ approximation to the distribution of the test statistic $H_c=\frac{H}{C}$:

\[ \begin{align*} H & = \frac{12}{n(n+1)} \sum_{g ∈ \mathcal{G}} \frac{R_g^2}{n_g} - 3(n+1)\\ C & = 1-\frac{1}{n^3-n}\sum_{t ∈ \mathcal{T}} (t^3-t), \end{align*}\]

where $\mathcal{T}$ is the set of the counts of tied values at each tied position, $n$ is the total number of observations across all groups, and $n_g$ and $R_g$ are the number of observations and the rank sum in group $g$, respectively. See references for further details.
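
The formulas above can be evaluated directly. The following is a minimal sketch in Base Julia, with hypothetical helper names, that computes $H_c$ from raw groups using average ranks for ties. It illustrates the formulas and is not the package's implementation:

```julia
# Average ranks of `v`, with tied values sharing the mean of their positions.
function average_ranks(v)
    order = sortperm(v)
    ranks = zeros(Float64, length(v))
    i = 1
    while i <= length(v)
        j = i
        # Extend j to the end of the current run of tied values.
        while j < length(v) && v[order[j + 1]] == v[order[i]]
            j += 1
        end
        ranks[order[i:j]] .= (i + j) / 2
        i = j + 1
    end
    return ranks
end

# Tie-corrected Kruskal-Wallis statistic H_c = H / C.
function kruskal_wallis_statistic(groups...)
    pooled = collect(Iterators.flatten(groups))
    n = length(pooled)
    ranks = average_ranks(pooled)

    # Sum of R_g^2 / n_g over the groups.
    total = 0.0
    offset = 0
    for g in groups
        R_g = sum(ranks[offset + 1:offset + length(g)])
        total += R_g^2 / length(g)
        offset += length(g)
    end
    H = 12 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie counts t: how often each distinct value occurs in the pooled data.
    tie_counts = [count(==(v), pooled) for v in unique(pooled)]
    C = 1 - sum(t^3 - t for t in tie_counts) / (n^3 - n)

    return H / C
end

# No ties here, so C = 1 and the result is approximately 7.2.
println(kruskal_wallis_statistic([1, 2, 3], [4, 5, 6], [7, 8, 9]))
```

Values that occur only once contribute $t^3 - t = 0$, so including them in the tie counts leaves $C$ unchanged.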

Implements: pvalue

References

  • Meyer, J.P., Seaman, M.A., Expanded tables of critical values for the Kruskal-Wallis H statistic. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, April 2006.

Mann-Whitney U test

MannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})

Perform a Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.

The Mann-Whitney U test is sometimes known as the Wilcoxon rank-sum test.

When there are no tied ranks and ≤50 samples, or tied ranks and ≤10 samples, MannWhitneyUTest performs an exact Mann-Whitney U test. In all other cases, MannWhitneyUTest performs an approximate Mann-Whitney U test. Behavior may be further controlled by using ExactMannWhitneyUTest or ApproximateMannWhitneyUTest directly.

Implements: pvalue

ExactMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})

Perform an exact Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.

When there are no tied ranks, the exact p-value is computed using the pwilcox function from the Rmath package. In the presence of tied ranks, a p-value is computed by exhaustive enumeration of permutations, which can be very slow for even moderately sized data sets.

Implements: pvalue

ApproximateMannWhitneyUTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})

Perform an approximate Mann-Whitney U test of the null hypothesis that the probability that an observation drawn from the same population as x is greater than an observation drawn from the same population as y is equal to the probability that an observation drawn from the same population as y is greater than an observation drawn from the same population as x against the alternative hypothesis that these probabilities are not equal.

The p-value is computed using a normal approximation to the distribution of the Mann-Whitney U statistic:

\[ \begin{align*} μ & = \frac{n_x n_y}{2}\\ σ^2 & = \frac{n_x n_y}{12}\left(n_x + n_y + 1 - \frac{a}{(n_x + n_y)(n_x + n_y - 1)}\right)\\ a & = \sum_{t \in \mathcal{T}} (t^3 - t) \end{align*}\]

where $\mathcal{T}$ is the set of the counts of tied values at each tied position.
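
As a worked illustration, the sketch below evaluates the parameters of the normal approximation in Base Julia. It treats the tie-corrected expression as the variance and takes its square root to obtain the standard deviation. The function name is hypothetical and the data are made up:

```julia
# Mean and standard deviation of the normal approximation to U.
function mann_whitney_normal_params(x, y)
    n_x, n_y = length(x), length(y)
    n = n_x + n_y
    pooled = vcat(x, y)

    # a = sum of (t^3 - t) over the tie counts t in the pooled sample.
    # Untied values have t = 1 and contribute zero.
    a = sum(count(==(v), pooled)^3 - count(==(v), pooled) for v in unique(pooled))

    mu = n_x * n_y / 2
    variance = n_x * n_y / 12 * (n + 1 - a / (n * (n - 1)))
    return mu, sqrt(variance)
end

mu, sigma = mann_whitney_normal_params([1.1, 2.3, 3.0], [0.4, 2.9, 5.2])
println("mu = ", mu, ", sigma = ", sigma)
```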

Implements: pvalue

Sign test

SignTest(x::AbstractVector{T<:Real}, median::Real = 0)
SignTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, median::Real = 0)

Perform a sign test of the null hypothesis that the distribution from which x (or x - y if y is provided) was drawn has a median equal to the median argument against the alternative hypothesis that the median is not equal to it.

Implements: pvalue, confint

Wald-Wolfowitz independence test

WaldWolfowitzTest(x::AbstractVector{Bool})
WaldWolfowitzTest(x::AbstractVector{<:Real})

Perform the Wald-Wolfowitz (or Runs) test of the null hypothesis that the given data is random, or independently sampled. The data can come as many-valued or two-valued (Boolean). If many-valued, the sample is transformed by labelling each element as above or below the median.

Implements: pvalue
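
The transformation of many-valued data and the notion of a run can be sketched with the standard library alone. The helper names are hypothetical, and counting values equal to the median as "above" is an assumption of this sketch rather than a documented guarantee:

```julia
using Statistics   # standard library, for `median`

# Label each element as at-or-above (true) or below (false) the median.
# Treating values equal to the median as "above" is an assumption here.
dichotomize(x) = x .>= median(x)

# Count runs: maximal stretches of consecutive identical labels.
function count_runs(labels)
    isempty(labels) && return 0
    return 1 + count(labels[i] != labels[i - 1] for i in 2:length(labels))
end

x = [1.0, 5.0, 2.0, 6.0, 3.0, 7.0]
labels = dichotomize(x)
println(count_runs(labels))   # alternating labels give 6 runs
```

Very many or very few runs relative to the sample size are evidence against randomness. For instance, the sorted vector [1, 2, 3, 4, 5, 6] yields only 2 runs.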

Wilcoxon signed rank test

SignedRankTest(x::AbstractVector{<:Real})
SignedRankTest(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})

Perform a Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.

When there are no tied ranks and ≤50 samples, or tied ranks and ≤15 samples, SignedRankTest performs an exact signed rank test. In all other cases, SignedRankTest performs an approximate signed rank test. Behavior may be further controlled by using ExactSignedRankTest or ApproximateSignedRankTest directly.

Implements: pvalue, confint

ExactSignedRankTest(x::AbstractVector{<:Real}[, y::AbstractVector{<:Real}])

Perform an exact Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.

When there are no tied ranks, the exact p-value is computed using the psignrank function from the Rmath package. In the presence of tied ranks, a p-value is computed by exhaustive enumeration of permutations, which can be very slow for even moderately sized data sets.

Implements: pvalue, confint

ApproximateSignedRankTest(x::AbstractVector{<:Real}[, y::AbstractVector{<:Real}])

Perform an approximate Wilcoxon signed rank test of the null hypothesis that the distribution of x (or the difference x - y if y is provided) has zero median against the alternative hypothesis that the median is non-zero.

The p-value is computed using a normal approximation to the distribution of the signed rank statistic:

\[ \begin{align*} μ & = \frac{n(n + 1)}{4}\\ σ^2 & = \frac{n(n + 1)(2n + 1)}{24} - \frac{a}{48}\\ a & = \sum_{t \in \mathcal{T}} (t^3 - t) \end{align*}\]

where $\mathcal{T}$ is the set of the counts of tied values at each tied position.
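
A minimal sketch in Base Julia evaluates these parameters. The function name is hypothetical. The input d is assumed to contain only nonzero differences, ties are counted among the absolute differences, and the tie-corrected expression is treated as the variance whose square root gives the standard deviation:

```julia
# Mean and standard deviation of the normal approximation to the
# signed rank statistic. `d` is assumed to hold only nonzero differences.
function signed_rank_normal_params(d)
    n = length(d)
    magnitudes = abs.(d)

    # a = sum of (t^3 - t) over tie counts t among the absolute differences.
    a = sum(count(==(v), magnitudes)^3 - count(==(v), magnitudes) for v in unique(magnitudes))

    mu = n * (n + 1) / 4
    variance = n * (n + 1) * (2n + 1) / 24 - a / 48
    return mu, sqrt(variance)
end

mu, sigma = signed_rank_normal_params([1.0, -2.0, 3.0, 4.0])
println("mu = ", mu, ", sigma = ", sigma)
```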

Implements: pvalue, confint

Permutation test

ExactPermutationTest(x::Vector, y::Vector, f::Function)

Perform a permutation test (a.k.a. randomization test) of the null hypothesis that f(x) is equal to f(y). All possible permutations are sampled.

ApproximatePermutationTest(x::Vector, y::Vector, f::Function, n::Int)

Perform a permutation test (a.k.a. randomization test) of the null hypothesis that f(x) is equal to f(y). n of the factorial(length(x)+length(y)) permutations are sampled at random.
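
The sampling idea can be sketched with the standard library alone. The function below is a hypothetical illustration, not the package's implementation: it shuffles the pooled data n times, splits it back into groups of the original sizes, and reports the fraction of shuffles whose gap in f is at least as large as the observed one.

```julia
using Random       # standard library, for `shuffle`
using Statistics   # standard library, for `mean`

# Estimate a two-sided p-value by sampling `n` random permutations
# of the pooled data and comparing |f(x*) - f(y*)| with the observed gap.
function approximate_permutation_pvalue(x, y, f, n; rng = Random.default_rng())
    observed = abs(f(x) - f(y))
    pooled = vcat(x, y)
    hits = 0
    for _ in 1:n
        perm = shuffle(rng, pooled)
        x_star = perm[1:length(x)]
        y_star = perm[length(x) + 1:end]
        hits += abs(f(x_star) - f(y_star)) >= observed
    end
    return hits / n
end

p = approximate_permutation_pvalue([1.2, 2.4, 3.1], [4.8, 5.5, 6.9], mean, 1000)
println("estimated p-value: ", p)
```

The estimate varies from run to run. Passing a seeded rng makes it reproducible.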
