Scatter Matrix and Covariance

This package implements functions for computing scatter matrices and weighted covariance matrices.

scattermat(X, [wv::AbstractWeights]; mean=nothing, dims=1)

Compute the scatter matrix, which is an unnormalized covariance matrix. A weighting vector wv can be specified to weight the estimate.


  • mean=nothing: a known mean value. nothing indicates that the mean is unknown, and the function will compute the mean. Specifying mean=0 indicates that the data are centered and hence there's no need to subtract the mean.
  • dims=1: the dimension along which the variables are organized. When dims = 1, the variables are considered columns with observations in rows; when dims = 2, variables are in rows with observations in columns.
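As a minimal sketch of what the scatter matrix contains, the snippet below compares scattermat with a hand-written centered cross-product. The data matrix X and the names Z, S_manual and S_rows are made up for illustration; they are not part of the package.

```julia
using StatsBase, Statistics

# Illustrative data: 4 observations (rows) of 2 variables (columns),
# i.e. the dims=1 layout described above.
X = [1.0 2.0;
     2.0 1.0;
     3.0 5.0;
     4.0 4.0]

S = scattermat(X; dims=1)

# Manual equivalent: center each column, then form Z'Z.
Z = X .- mean(X, dims=1)
S_manual = Z' * Z

# The same data with variables in rows should give the same matrix with dims=2.
S_rows = scattermat(permutedims(X); dims=2)

# Dividing by n - 1 recovers the usual (unweighted) sample covariance.
n = size(X, 1)
C = S ./ (n - 1)
```

Here S, S_manual and S_rows should agree, and C should match cov(X) from the Statistics standard library.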

cov(X, w::AbstractWeights, vardim=1; mean=nothing, corrected=false)

Compute the weighted covariance matrix. As with var and std, the biased covariance matrix (corrected=false) is computed by multiplying scattermat(X, w) by $\frac{1}{\sum{w}}$ to normalize. The unbiased covariance matrix (corrected=true), however, uses a normalization factor that depends on the type of weights:

  • AnalyticWeights: $\frac{1}{\sum w - \sum {w^2} / \sum w}$
  • FrequencyWeights: $\frac{1}{\sum{w} - 1}$
  • ProbabilityWeights: $\frac{n}{(n - 1) \sum w}$ where $n$ equals count(!iszero, w)
  • Weights: ArgumentError (bias correction not supported)
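The normalization factors listed above can be checked numerically. The following is a sketch under assumed inputs: the data X and the weight vectors fw and aw are arbitrary illustrative values, and corrected is always passed explicitly so the result does not depend on the default.

```julia
using StatsBase

# Illustrative data: 4 observations (rows) of 2 variables (columns).
X = [1.0 2.0;
     2.0 1.0;
     3.0 5.0;
     4.0 4.0]

# Frequency weights: each row counts as that many repeated observations.
fw = fweights([1, 2, 1, 2])
S_f = scattermat(X, fw)
C_biased   = cov(X, fw; corrected=false)   # expected: S_f / sum(w)
C_unbiased = cov(X, fw; corrected=true)    # expected: S_f / (sum(w) - 1)

# Analytic weights: the correction uses sum(w) - sum(w^2) / sum(w).
aw = aweights([0.5, 1.0, 1.5, 1.0])
S_a = scattermat(X, aw)
C_analytic = cov(X, aw; corrected=true)
factor_a = 1 / (sum(aw) - sum(abs2, aw) / sum(aw))
```

With these inputs, C_biased should equal S_f ./ sum(fw), C_unbiased should equal S_f ./ (sum(fw) - 1), and C_analytic should equal S_a .* factor_a.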

cov(ce::CovarianceEstimator, x::AbstractVector; mean=nothing)

Compute a variance estimate from the observation vector x using the estimator ce.

cov(ce::CovarianceEstimator, x::AbstractVector, y::AbstractVector)

Compute the covariance of the vectors x and y using estimator ce.

cov(ce::CovarianceEstimator, X::AbstractMatrix, [w::AbstractWeights]; mean=nothing, dims::Int=1)

Compute the covariance matrix of the matrix X along dimension dims using estimator ce. A weighting vector w can be specified. The keyword argument mean can be:

  • nothing (default) in which case the mean is estimated and subtracted from the data X,
  • a precalculated mean in which case it is subtracted from the data X. Assuming size(X) is (N,M), mean can either be:
    • when dims=1, an AbstractMatrix of size (1,M),
    • when dims=2, an AbstractVector of length N or an AbstractMatrix of size (N,1).

cor(X, w::AbstractWeights, dims=1)

Compute the Pearson correlation matrix of X along the dimension dims with a weighting w.
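A small sketch of how the weighted correlation relates to the weighted covariance: rescaling the biased weighted covariance by the weighted standard deviations should reproduce cor. The data X and weights w are illustrative assumptions.

```julia
using StatsBase, LinearAlgebra

# Illustrative data: 4 observations (rows) of 2 variables (columns).
X = [1.0 2.0;
     2.0 1.0;
     3.0 5.0;
     4.0 4.0]
w = aweights([0.5, 1.0, 1.5, 1.0])

R = cor(X, w, 1)

# Manual equivalent: divide each covariance entry by the product of the
# corresponding weighted standard deviations.
C = cov(X, w, 1; corrected=false)
s = sqrt.(diag(C))
R_manual = C ./ (s * s')
```

The diagonal of R should be all ones, and R should agree with R_manual.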

mean_and_cov(x, [wv::AbstractWeights,] vardim=1; corrected=false) -> (mean, cov)

Return the mean and covariance matrix as a tuple. A weighting vector wv can be specified. vardim designates whether the variables are columns in the matrix (1) or rows (2). Bias correction is applied to the covariance calculation if corrected=true. See the cov documentation for more details.
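A usage sketch for mean_and_cov with weights, using made-up data. The manual weighted mean (m_manual) is written out only for comparison and is not part of the package.

```julia
using StatsBase

# Illustrative data: 4 observations (rows) of 2 variables (columns).
X = [1.0 2.0;
     2.0 1.0;
     3.0 5.0;
     4.0 4.0]
w = fweights([1, 2, 1, 2])

# Variables are columns, so vardim is 1.
m, C = mean_and_cov(X, w, 1; corrected=false)

# Manual weighted column means for comparison.
m_manual = sum(w .* X, dims=1) ./ sum(w)

# The covariance part should match a direct call to cov.
C_direct = cov(X, w, 1; corrected=false)
```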

cov2cor(C, s)

Compute the correlation matrix from the covariance matrix C and a vector of standard deviations s. Use StatsBase.cov2cor! for an in-place version.

cor2cov(C, s)

Compute the covariance matrix from the correlation matrix C and a vector of standard deviations s. Use StatsBase.cor2cov! for an in-place version.
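The two conversions are inverses of each other when the same standard deviations are used. A minimal round-trip sketch, with illustrative data and with s taken from the diagonal of the covariance matrix:

```julia
using StatsBase, Statistics, LinearAlgebra

# Illustrative data: 4 observations (rows) of 2 variables (columns).
X = [1.0 2.0;
     2.0 1.0;
     3.0 5.0;
     4.0 4.0]

C = cov(X)               # sample covariance from the Statistics stdlib
s = sqrt.(diag(C))       # standard deviations of each variable

R = cov2cor(C, s)        # covariance -> correlation
C_back = cor2cov(R, s)   # correlation -> covariance
```

R should agree with cor(X), and C_back should reproduce C up to floating-point error.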

SimpleCovariance(;corrected::Bool=false)

Simple covariance estimator. Estimation calls cov(x; corrected=corrected), cov(x, y; corrected=corrected) or cov(X, w, dims; corrected=corrected), where x and y are vectors, X is a matrix and w is a weighting vector.
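A sketch of the estimator interface, assuming the simple estimator described above is the exported SimpleCovariance type with a corrected keyword. The data and variable names are illustrative only.

```julia
using StatsBase, Statistics

# Illustrative data: 4 observations (rows) of 2 variables (columns).
X = [1.0 2.0;
     2.0 1.0;
     3.0 5.0;
     4.0 4.0]
x, y = X[:, 1], X[:, 2]

# Estimator with bias correction enabled.
ce = SimpleCovariance(corrected=true)

C_mat = cov(ce, X)      # covariance matrix of the columns of X
c_xy  = cov(ce, x, y)   # covariance of two vectors

# Estimator with bias correction disabled.
C_biased = cov(SimpleCovariance(corrected=false), X)
```

Passing the estimator as the first argument should give the same results as calling cov from Statistics with the matching corrected keyword.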