Methods

Confidence interval

StatsBase.confint - Function.
confint(test::HypothesisTest; alpha = 0.05, tail = :both)

Compute a confidence interval C with coverage 1-alpha.

If tail is :both (default), then a two-sided confidence interval is returned. If tail is :left or :right, then a one-sided confidence interval is returned.

Note

Most of the implemented confidence intervals are strongly consistent, that is, the confidence interval with coverage 1-alpha does not contain the hypothesized value $θ_0$ if and only if the corresponding test rejects the null hypothesis $h_0: θ = θ_0$ at significance level alpha:

\[ C (x, 1 − α) = \{θ : p_θ (x) > α\},\]

where $p_θ$ is the p-value of the corresponding test.
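The consistency property in the note can be checked numerically. The following is a minimal sketch, assuming the HypothesisTests package is installed; the data are made up for illustration, and only the default arguments of confint and pvalue are used, so the cutoff 0.05 below is the default significance level rather than something passed explicitly.

```julia
using HypothesisTests

# Made-up sample; the null hypothesis is that the mean equals 0.0.
x = [0.8, 1.9, -0.3, 2.4, 1.1, 0.5, 1.7, 0.9]
test = OneSampleTTest(x, 0.0)

lo, hi = confint(test)   # two-sided interval with default coverage 0.95
p = pvalue(test)         # two-sided p-value

# The interval should exclude the hypothesized mean exactly when the test rejects.
contains_null = lo <= 0.0 <= hi
rejects = p <= 0.05

println("interval = ($lo, $hi), p-value = $p")
println("contains θ₀: $contains_null, rejects h₀: $rejects")
```

For a t-test the two booleans are always opposite (up to ties at the boundary), which is the relation described above.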

StatsBase.confint - Method.
confint(test::BinomialTest; level = 0.95, tail = :both, method = :clopper_pearson)

Compute a confidence interval with coverage level for a binomial proportion using one of the following methods. Possible values for method are:

  • :clopper_pearson (default): Clopper-Pearson interval is based on the binomial distribution. The empirical coverage is never less than the nominal coverage level, so the interval is usually conservative.
  • :wald: Wald (or normal approximation) interval relies on the standard approximation of the actual binomial distribution by a normal distribution. Coverage can be erratically poor for success probabilities close to zero or one.
  • :wilson: Wilson score interval relies on a normal approximation. In contrast to :wald, the standard deviation is not approximated by an empirical estimate, resulting in good empirical coverages even for small numbers of draws and extreme success probabilities.
  • :jeffrey: Jeffreys interval is a Bayesian credible interval obtained by using a non-informative Jeffreys prior. The interval is very similar to the Wilson interval.
  • :agresti_coull: Agresti-Coull interval is a simplified version of the Wilson interval; both are centered around the same value. The Agresti-Coull interval has higher or equal coverage.
  • :arcsine: Confidence interval computed using the arcsine transformation to make $var(p)$ independent of the probability $p$.
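To make the difference between :wald and :wilson concrete, here is a small sketch that computes both intervals from their textbook formulas. It is not the package's implementation: the function names wald_interval and wilson_interval are hypothetical, and the normal quantile is hardcoded for a 95% level instead of being derived from the requested coverage.

```julia
# Standard normal quantile for 95% two-sided coverage (hardcoded assumption).
const Z95 = 1.959964

# Wald interval: plug the empirical proportion into the standard error.
function wald_interval(x, n; z = Z95)
    phat = x / n
    half = z * sqrt(phat * (1 - phat) / n)
    return (phat - half, phat + half)
end

# Wilson score interval: solve the score equation for p instead of
# estimating the standard deviation from phat.
function wilson_interval(x, n; z = Z95)
    phat = x / n
    denom = 1 + z^2 / n
    center = (phat + z^2 / (2n)) / denom
    half = z * sqrt(phat * (1 - phat) / n + z^2 / (4n^2)) / denom
    return (center - half, center + half)
end

wald = wald_interval(7, 20)
wilson = wilson_interval(7, 20)
println("Wald:   ", wald)
println("Wilson: ", wilson)
```

With the package loaded, the corresponding call would be confint(BinomialTest(7, 20); method = :wilson); the Wilson center is pulled from the empirical proportion toward 0.5, which is what stabilizes coverage for small samples.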

References

  • Brown, L.D., Cai, T.T., and DasGupta, A. Interval estimation for a binomial proportion. Statistical Science, 16(2):101–117, 2001.

StatsBase.confint - Method.
confint(test::PowerDivergenceTest; alpha = 0.05, tail = :both, method = :auto)

Compute a confidence interval with coverage 1-alpha for multinomial proportions using one of the following methods. Possible values for method are:

  • :auto (default): If the minimum of the expected cell counts exceeds 100, Quesenberry-Hurst intervals are used, otherwise Sison-Glaz.
  • :sison_glaz: Sison-Glaz intervals
  • :bootstrap: Bootstrap intervals
  • :quesenberry_hurst: Quesenberry-Hurst intervals
  • :gold: Gold intervals (asymptotic simultaneous intervals)
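A short usage sketch for the method keyword, assuming the HypothesisTests package is installed. The counts are made up, ChisqTest is used here as one way to obtain a PowerDivergenceTest, and :bootstrap is left out because its result is random.

```julia
using HypothesisTests

counts = [20, 30, 50]            # made-up cell counts
test = ChisqTest(counts)         # null hypothesis: equal cell probabilities

ci_auto = confint(test)                               # method = :auto
ci_qh   = confint(test; method = :quesenberry_hurst)
ci_gold = confint(test; method = :gold)

phat = counts ./ sum(counts)     # observed proportions
for (i, (lo, hi)) in enumerate(ci_qh)
    println("cell $i: phat = $(phat[i]), Quesenberry-Hurst interval = ($lo, $hi)")
end
```

Each result is expected to hold one (lower, upper) pair per cell. Since the smallest expected count here is below 100, the :auto rule described above would select Sison-Glaz.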

References

  • Agresti, Alan. Categorical Data Analysis, 3rd Edition. Wiley, 2013.
  • Sison, C.P. and Glaz, J. Simultaneous confidence intervals and sample size determination for multinomial proportions. Journal of the American Statistical Association, 90:366-369, 1995.
  • Quesenberry, C.P. and Hurst, D.C. Large Sample Simultaneous Confidence Intervals for Multinomial Proportions. Technometrics, 6:191-195, 1964.
  • Gold, R. Z. Tests Auxiliary to $χ^2$ Tests in a Markov Chain. Annals of Mathematical Statistics, 34:56-74, 1963.

StatsBase.confint - Method.
confint(x::FisherExactTest; level::Float64=0.95, tail=:both, method=:central)

Compute a confidence interval for the odds ratio with coverage level. One-sided intervals are based on Fisher's non-central hypergeometric distribution. For tail = :both, the only method implemented so far is the central interval (:central).

Note

Since the p-value is not necessarily unimodal, the corresponding confidence region might not be an interval.
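As a usage sketch, assuming the HypothesisTests package is installed, the intervals for a small made-up 2x2 table can be requested as follows. Only the tail keyword is varied; the coverage is left at its default.

```julia
using HypothesisTests

# Made-up 2x2 table with cells a = 3, b = 1, c = 1, d = 3.
test = FisherExactTest(3, 1, 1, 3)

ci_both  = confint(test)                  # central two-sided interval
ci_left  = confint(test; tail = :left)    # one-sided interval
ci_right = confint(test; tail = :right)   # one-sided interval

println("two-sided: ", ci_both)
println("left:      ", ci_left)
println("right:     ", ci_right)
```

With so few observations the two-sided interval is typically very wide, which reflects how little a table of this size says about the odds ratio.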

References

  • Gibbons, J.D., Pratt, J.W. P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374.

p-value

pvalue(test::HypothesisTest; tail = :both)

Compute the p-value for a given significance test.

If tail is :both (default), then the p-value for the two-sided test is returned. If tail is :left or :right, then a one-sided test is performed.
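The effect of the tail keyword can be seen on a simple example. This is a minimal sketch, assuming the HypothesisTests package is installed; the sample and the hypothesized mean 1.0 are made up.

```julia
using HypothesisTests

y = [1.2, 0.7, 1.9, 1.4, 0.9, 1.6, 2.1, 1.3]   # made-up sample
ttest = OneSampleTTest(y, 1.0)                  # null hypothesis: mean = 1.0

p_both  = pvalue(ttest)                   # two-sided (default)
p_left  = pvalue(ttest; tail = :left)     # one-sided
p_right = pvalue(ttest; tail = :right)    # one-sided

println("both = $p_both, left = $p_left, right = $p_right")
```

For a continuous test statistic such as this one, the two one-sided p-values sum to 1 and the two-sided value is twice the smaller of them.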

pvalue(x::FisherExactTest; tail = :both, method = :central)

Compute the p-value for a given Fisher exact test.

The one-sided p-values are based on Fisher's non-central hypergeometric distribution $f_ω(i)$ with odds ratio $ω$:

\[ \begin{align*} p_ω^{(\text{left})} &=\sum_{i ≤ a} f_ω(i)\\ p_ω^{(\text{right})} &=\sum_{i ≥ a} f_ω(i) \end{align*}\]

For tail = :both, possible values for method are:

  • :central (default): Central interval, i.e. the p-value is two times the minimum of the one-sided p-values.
  • :minlike: Minimum likelihood interval, i.e. the p-value is computed by summing the probabilities of all tables with the same marginals that are equally or less probable than the observed table:
    \[ p_ω = \sum_{f_ω(i)≤ f_ω(a)} f_ω(i)\]
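The formulas above can be worked through by hand for a small table. The sketch below is not the package's implementation: fisher_pvalues is a hypothetical helper that builds $f_ω$ directly from binomial coefficients, uses exact rational arithmetic, and caps the :central value at 1 (an assumption of this sketch, so that the result stays a valid probability).

```julia
# p-values for the 2x2 table [a b; c d] under odds ratio ω,
# computed from Fisher's non-central hypergeometric distribution.
function fisher_pvalues(a, b, c, d; ω = 1//1)
    r1, r2, c1 = a + b, c + d, a + c              # row sums and first column sum
    support = max(0, c1 - r2):min(r1, c1)         # feasible values of the top-left cell
    weights = [binomial(r1, i) * binomial(r2, c1 - i) * ω^i for i in support]
    f = weights ./ sum(weights)                   # f_ω(i) over the support

    left  = sum(f[support .<= a])                 # sum over i ≤ a
    right = sum(f[support .>= a])                 # sum over i ≥ a
    central = min(1, 2 * min(left, right))        # twice the smaller one-sided value
    fa = f[findfirst(==(a), support)]             # probability of the observed table
    minlike = sum(f[f .<= fa])                    # tables no more probable than observed
    return (left = left, right = right, central = central, minlike = minlike)
end

p = fisher_pvalues(3, 1, 1, 3)
println(p)
```

For this table and $ω = 1$, the support is 0 to 4 with probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, so the left value is 69/70, the right value is 17/70, and both two-sided methods give 34/70. The two methods coincide here because the distribution is symmetric; in general they differ.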

References

  • Gibbons, J.D., Pratt, J.W. P-values: Interpretation and Methodology, American Statistician, 29(1):20-25, 1975.
  • Fay, M.P., Supplementary material to "Confidence intervals that match Fisher’s exact or Blaker’s exact tests". Biostatistics, Volume 11, Issue 2, 1 April 2010, Pages 373–374.