Factor Analysis

Factor Analysis (FA) is a linear-Gaussian latent variable model that is closely related to probabilistic PCA. In contrast to the probabilistic PCA model, the covariance of the conditional distribution of the observed variable given the latent variable is diagonal rather than isotropic [1].
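For orientation, the model can be written out as follows. This is the standard textbook formulation of factor analysis rather than anything specific to this package; the symbols $d$ (observation dimension), $p$ (latent dimension), $\mathbf{W}$ (a $d \times p$ loading matrix) and $\mathbf{\Psi}$ (diagonal noise covariance) are the assumed notation.

```latex
\begin{aligned}
\mathbf{z} &\sim \mathcal{N}(\mathbf{0}, \mathbf{I}_p) \\
\mathbf{x} \mid \mathbf{z} &\sim \mathcal{N}(\mathbf{W}\mathbf{z} + \boldsymbol{\mu}, \mathbf{\Psi}),
  \qquad \mathbf{\Psi} = \mathrm{diag}(\psi_1, \dots, \psi_d) \\
\mathbf{x} &\sim \mathcal{N}(\boldsymbol{\mu}, \mathbf{W}\mathbf{W}^T + \mathbf{\Psi})
\end{aligned}
```

Probabilistic PCA corresponds to the special case $\mathbf{\Psi} = \sigma^2 \mathbf{I}_d$, where every observed dimension shares the same noise variance.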

This package defines a FactorAnalysis type to represent a factor analysis model, and provides a set of methods to access the properties.

In the method descriptions below, let $M$ be an instance of FactorAnalysis, $d$ be the dimension of the observations, and $p$ be the output dimension (i.e., the dimension of the latent space).

StatsAPI.fit - Method
fit(FactorAnalysis, X; ...)

Perform factor analysis over the data given in a matrix X. Each column of X is an observation. This method returns an instance of FactorAnalysis.

Keyword arguments:

Let (d, n) = size(X) be respectively the input dimension and the number of observations:

  • method: The algorithm to use:
    • :em: use the EM version of factor analysis
    • :cm: use the CM version of factor analysis (default)
  • maxoutdim: Maximum output dimension (default d-1)
  • mean: The mean vector, which can be either of:
    • 0: the input data has already been centered
    • nothing: this function will compute the mean (default)
    • a pre-computed mean vector
  • tol: Convergence tolerance (default 1.0e-6)
  • maxiter: Maximum number of iterations (default 1000)
  • η: Lower bound on the variance (default 1.0e-6)

Notes: This function calls facm or faem internally, depending on the choice of method.
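A minimal sketch of calling fit with the keyword arguments listed above. The synthetic data, the seed, and the variable names are assumptions made for illustration; only the default CM algorithm is exercised here.

```julia
using MultivariateStats
using Statistics
using Random

Random.seed!(42)  # fixed seed so the synthetic data is repeatable

# Hypothetical synthetic data: d = 8 observed dimensions, n = 500 observations,
# generated from 2 latent factors plus independent noise.
d, n = 8, 500
X = randn(d, 2) * randn(2, n) .+ 0.5 .* randn(d, n)

# Default settings: CM algorithm, mean estimated from the data.
M = fit(FactorAnalysis, X; maxoutdim=2)

# Data centered by hand, so mean=0 tells fit that no mean needs to be computed.
Xc = X .- mean(X, dims=2)
Mc = fit(FactorAnalysis, Xc; method=:cm, maxoutdim=2, mean=0,
         tol=1.0e-6, maxiter=1000)

sz = size(M)  # (input dimension, output dimension)
```

Since maxoutdim is documented as a maximum, the second entry of size(M) is best read from the fitted model rather than assumed.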

Base.size - Method
size(M::FactorAnalysis)

Returns a tuple containing the input dimension $d$ (i.e., the dimension of the observation space) and the output dimension $p$ (i.e., the dimension of the latent space).


Given a factor analysis model $M$, one can use it to transform observations into latent variables, as

$$\mathbf{z} = \mathbf{W}^T \mathbf{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})$$

or use it to reconstruct (approximately) the observations from latent variables, as

$$\tilde{\mathbf{x}} = \mathbf{\Sigma} \mathbf{W} (\mathbf{W}^T \mathbf{W})^{-1} \mathbf{z} + \boldsymbol{\mu}$$

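Here, $\mathbf{W}$ is the $d \times p$ matrix of factor loadings (the weight matrix), $\mathbf{\Psi}$ is the diagonal noise covariance, and $\mathbf{\Sigma} = \mathbf{\Psi} + \mathbf{W} \mathbf{W}^T$ is the covariance matrix of the observations.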
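The two formulas can be checked numerically with nothing but the standard library. The sketch below is not the package's implementation; it assumes a $d \times p$ loading matrix, so that the covariance is $\mathbf{\Psi} + \mathbf{W}\mathbf{W}^T$, and all numeric values are made up for illustration.

```julia
using LinearAlgebra

# Hypothetical small model: d = 4 observed dimensions, p = 2 factors.
W = [1.0 0.0; 0.5 1.0; 0.0 0.8; 0.3 0.3]   # d × p factor loadings
Ψ = Diagonal([0.2, 0.1, 0.3, 0.25])         # diagonal noise covariance
μ = [1.0, -1.0, 0.5, 0.0]                   # mean vector

Σ = Ψ + W * W'                              # d × d covariance of the observations

x = [1.5, 0.0, 1.0, 0.2]                    # one observation
z = W' * (Σ \ (x - μ))                      # latent variables (length p)
xr = Σ * W * ((W' * W) \ z) + μ             # approximate reconstruction (length d)

# Transforming the reconstruction again gives back the same latent variables,
# because W' * inv(Σ) * Σ * W * inv(W' * W) is the identity.
z_again = W' * (Σ \ (xr - μ))
```

The reconstruction generally differs from x, since only $p$ of the $d$ degrees of freedom survive the transformation, but the latent variables are preserved by the round trip.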

The package provides methods to do so:

StatsAPI.predict - Method
predict(M::FactorAnalysis, x)

Transform observations x into latent variables. Here, x can be either a vector of length d or a matrix where each column is an observation.

MultivariateStats.reconstruct - Method
reconstruct(M::FactorAnalysis, z)

Approximately reconstruct observations from the latent variables given in z. Here, z can be either a vector of length $p$ or a matrix where each column gives the latent variables for an observation.
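As a rough usage sketch, predict and reconstruct can be applied to a whole data matrix or to a single observation. The synthetic data and variable names below are assumptions for illustration only.

```julia
using MultivariateStats
using Random

Random.seed!(1)  # fixed seed so the synthetic data is repeatable

# Hypothetical synthetic data: d = 6 observed dimensions, n = 300 observations.
d, n = 6, 300
X = randn(d, 2) * randn(2, n) .+ 0.5 .* randn(d, n)

M = fit(FactorAnalysis, X; maxoutdim=2)

Z = predict(M, X)          # matrix input: one column of latent variables per observation
z1 = predict(M, X[:, 1])   # vector input: latent variables of a single observation
Xr = reconstruct(M, Z)     # approximate reconstruction, same shape as X
```

The number of rows of Z matches the output dimension reported by size(M).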


Auxiliary functions:

MultivariateStats.faem - Function
faem(S, mean, n; ...)

Performs factor analysis using an expectation-maximization algorithm for a given sample covariance matrix S [2].

Parameters

  • S: The sample covariance matrix.
  • mean: The mean vector of original samples, which can be a vector of length $d$, or an empty vector indicating a zero mean.
  • n: The number of observations.

Returns the resultant FactorAnalysis model.

Note: This function accepts three keyword arguments: maxoutdim, tol, and maxiter.

MultivariateStats.facm - Function
facm(S, mean, n; ...)

Performs factor analysis using a fast conditional maximization algorithm for a given sample covariance matrix S [3].

Parameters

  • S: The sample covariance matrix.
  • mean: The mean vector of original samples, which can be a vector of length $d$, or an empty vector indicating a zero mean.
  • n: The number of observations.

Returns the resultant FactorAnalysis model.

Note: This function accepts four keyword arguments: maxoutdim, tol, maxiter, and η.
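A sketch of calling an auxiliary function directly on a precomputed covariance matrix, under a few assumptions: the covariance is computed with Statistics.cov over the columns of a synthetic data matrix, the mean is passed as a plain vector of length $d$, and maxoutdim is given explicitly instead of relying on a default. faem takes the same positional arguments.

```julia
using MultivariateStats
using Statistics
using Random

Random.seed!(7)  # fixed seed so the synthetic data is repeatable

# Hypothetical synthetic data: d = 8 observed dimensions, n = 500 observations.
d, n = 8, 500
X = randn(d, 2) * randn(2, n) .+ 0.5 .* randn(d, n)

mv = vec(mean(X, dims=2))   # mean vector of length d
S = cov(X; dims=2)          # d × d sample covariance (each column is an observation)

# The function name is qualified with the module name to avoid assuming it is exported.
M = MultivariateStats.facm(S, mv, n; maxoutdim=2, tol=1.0e-6, maxiter=1000)
```

The returned model can then be passed to predict and reconstruct like one obtained from fit.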


References

  • [1] Bishop, C. M. Pattern Recognition and Machine Learning, 2006.
  • [2] Rubin, Donald B., and Dorothy T. Thayer. EM algorithms for ML factor analysis. Psychometrika 47.1, 69-76, 1982.
  • [3] Zhao, J-H., Philip LH Yu, and Qibao Jiang. ML estimation for factor analysis: EM or non-EM?. Statistics and Computing 18.2, 109-123, 2008.