Independent Component Analysis
Independent Component Analysis (ICA) is a computational technique for separating a multivariate signal into additive subcomponents, with the assumption that the subcomponents are non-Gaussian and independent from each other.
Several algorithms exist for ICA. Currently, this package implements only the FastICA algorithm.
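For orientation, the standard linear ICA model can be sketched as follows. The mixing matrix $\mathbf{A}$ and the source vector $\mathbf{s}$ are symbols introduced here for illustration, while $\mathbf{W}$ and $\boldsymbol{\mu}$ match the component matrix and mean used on the rest of this page:

$$
\mathbf{x} = \mathbf{A}\,\mathbf{s} + \boldsymbol{\mu},
\qquad
\mathbf{y} = \mathbf{W}^T (\mathbf{x} - \boldsymbol{\mu}) \approx \mathbf{s}
$$

Here $\mathbf{s}$ is assumed to have zero mean and independent, non-Gaussian entries. The recovered $\mathbf{y}$ matches $\mathbf{s}$ only up to the order, sign, and scale of its components.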
FastICA
The FastICA algorithm[1] is exposed through the ICA type, which defines a FastICA model:
MultivariateStats.ICA — Type
This type contains ICA model parameters: the mean vector and the component matrix $W$.
Note: Each column of the component matrix $W$ corresponds to an independent component.
Several methods are provided to work with ICA. Let $M$ be an instance of ICA:
StatsAPI.fit — Method
fit(ICA, X, k; ...)
Perform ICA over the data set given in X.
Parameters:
- X: The data matrix, of size $(m, n)$. Each row corresponds to a mixed signal, while each column corresponds to an observation (e.g. all signal values at a particular time step).
- k: The number of independent components to recover.
Keyword Arguments:
- alg: The choice of algorithm (default :fastica)
- fun: The approximate neg-entropy functor (default Tanh)
- do_whiten: Whether to perform pre-whitening (default true)
- maxiter: Maximum number of iterations (default 100)
- tol: Tolerable change of $W$ at convergence (default 1.0e-6)
- mean: The mean vector, which can be one of:
  - 0: the input data has already been centralized
  - nothing: this function will compute the mean (default)
  - a pre-computed mean vector
- winit: Initial guess of $W$, which should be one of:
  - empty matrix: the function will perform random initialization (default)
  - a matrix of size $(k, k)$ (when do_whiten)
  - a matrix of size $(m, k)$ (when !do_whiten)
Returns the resultant ICA model, an instance of type ICA.
Note: If do_whiten is true, the returned $W$ satisfies $\mathbf{W}^T \mathbf{C} \mathbf{W} = \mathbf{I}$, where $\mathbf{C}$ is the covariance matrix of X; otherwise $W$ is orthonormal, i.e. $\mathbf{W}^T \mathbf{W} = \mathbf{I}$.
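The whitening constraint in the note can be checked numerically. The following is a minimal sketch using only the Julia standard libraries; it does not call the package, and the eigenvalue-based construction of the whitening matrix W0 is an assumption made for illustration rather than a description of the package internals. It shows that multiplying a whitening matrix by any orthogonal matrix Q keeps the constraint satisfied, which is the freedom ICA uses to search for independent directions.

```julia
using LinearAlgebra, Statistics, Random

rng = MersenneTwister(1)
m, n, k = 3, 500, 2
X = randn(rng, m, m) * randn(rng, m, n)   # correlated toy data, one column per observation

C = cov(X; dims=2)                        # m×m sample covariance of the rows
F = eigen(Symmetric(C))                   # eigenvalues are returned in ascending order
idx = sortperm(F.values; rev=true)[1:k]   # indices of the k largest eigenvalues

# m×k whitening matrix: top-k eigenvectors scaled by 1/sqrt(eigenvalue)
W0 = F.vectors[:, idx] * Diagonal(1 ./ sqrt.(F.values[idx]))

Q = Matrix(qr(randn(rng, k, k)).Q)        # an arbitrary k×k orthogonal matrix
W = W0 * Q                                # rotated whitening matrix

println(norm(W' * C * W - I))             # deviation from the identity
```

The printed deviation should be at the level of floating-point round-off.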
Base.size — Method
size(M::ICA)
Returns a tuple with the input dimension, i.e. the number of observed mixtures, and the output dimension, i.e. the number of independent components.
Statistics.mean — Method
mean(M::ICA)
Returns the mean vector.
StatsAPI.predict — Method
predict(M::ICA, x)
Transform x to the output space to extract independent components, as $\mathbf{W}^T (\mathbf{x} - \boldsymbol{\mu})$, given the model M.
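The methods above can be combined as in the following sketch. The toy sources, the mixing matrix A, and the chosen maxiter and tol values are assumptions made for illustration. The sketch also assumes that predict accepts a matrix with one observation per column, and that fit may raise an error if the iteration does not converge within maxiter steps.

```julia
using MultivariateStats, Statistics, Random

Random.seed!(42)                          # fit initializes W randomly by default
n = 2000
t = range(0, 8π; length=n)

# Two non-Gaussian toy sources, one per row (2×n)
S = permutedims(hcat(sin.(2 .* t), sign.(sin.(3 .* t))))

A = [1.0 0.5; 0.5 1.0; 0.3 0.8]           # hypothetical 3×2 mixing matrix
X = A * S                                 # 3 mixed signals × n observations

# Recover k = 2 components with the default Tanh functor and pre-whitening
M = fit(ICA, X, 2; maxiter=1000, tol=1e-5)

indim, outdim = size(M)                   # number of mixtures, number of components
μ = mean(M)                               # mean vector of the mixtures
Y = predict(M, X)                         # W'(X .- μ), one component per row
```

With pre-whitening enabled, the rows of Y should be approximately uncorrelated with unit variance. They correspond to the original sources only up to order, sign, and scale.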
The package also exports functions of the core algorithms. Sometimes, it can be more efficient to directly invoke them instead of going through the fit interface.
MultivariateStats.fastica! — Function
fastica!(W, X, fun, maxiter, tol, verbose)
Invoke the FastICA algorithm[1].
Parameters:
- W: The initial un-mixing matrix, of size $(m, k)$. The function updates this matrix in place.
- X: The data matrix, of size $(m, n)$. This matrix is input only, and won't be modified.
- fun: The approximate neg-entropy functor of type ICAGDeriv.
- maxiter: Maximum number of iterations.
- tol: Tolerable change of W at convergence.
Returns the updated W.
Note: The number of components is inferred from W as size(W, 2).
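A direct call might look like the sketch below. It rests on several assumptions that are not stated on this page: that fastica! expects data that has already been centered and whitened (the preprocessing that fit performs when do_whiten is true), that Tanh can be constructed from its scale parameter as Tanh(1.0), and that verbose is a Bool. The toy data and the manual whitening step are illustrative only.

```julia
using MultivariateStats, LinearAlgebra, Statistics, Random

Random.seed!(7)
n = 2000
t = range(0, 8π; length=n)
S = permutedims(hcat(sin.(2 .* t), sign.(sin.(3 .* t))))   # 2×n toy sources
X = [1.0 0.5; 0.5 1.0] * S                                 # 2 mixtures of 2 sources

# Center and whiten by hand (assumed to be the caller's job for fastica!)
Xc = X .- mean(X; dims=2)
F = eigen(Symmetric(cov(Xc; dims=2)))
Z = Matrix(Diagonal(1 ./ sqrt.(F.values)) * F.vectors' * Xc)

W = randn(2, 2)      # initial guess of size (m, k); overwritten in place
MultivariateStats.fastica!(W, Z, MultivariateStats.Tanh(1.0), 200, 1e-6, false)

println(norm(W' * W - I))   # W is expected to be (close to) orthonormal
```

Skipping fit in this way avoids recomputing the mean and the whitening transform when they are already available, at the cost of doing that bookkeeping yourself.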
The FastICA method requires the first derivative of a function $g$ that is used to approximate negative entropy. The package provides the following interface for evaluating $g$ and its derivative:
MultivariateStats.ICAGDeriv — Type
The abstract type for all g (derivative) functions.
Let g be an instance of such a type. Then update!(g, U, E), given U = w'x, updates U and E in place such that:
- g(w'x) --> U
- E{g'(w'x)} --> E
MultivariateStats.Tanh — Type
Derivative for $(1/a_1)\log\cosh a_1 u$
MultivariateStats.Gaus — Type
Derivative for $-e^{\frac{-u^2}{2}}$
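To make the two functors concrete, the sketch below writes out $g$ and $g'$ for both contrast functions and checks them against finite differences. All function names here (tanh_g, gaus_g, G_tanh, G_gaus, fd) are hypothetical helpers defined for this illustration; they are not part of the package and do not reproduce its update! implementation.

```julia
using Statistics

# Contrast functions G(u) as listed above
G_tanh(u; a=1.0) = log(cosh(a * u)) / a
G_gaus(u) = -exp(-u^2 / 2)

# Hypothetical helper: returns (g(u), g'(u)) for G_tanh, where g = G'
function tanh_g(u; a=1.0)
    t = tanh(a * u)
    return (t, a * (1 - t^2))
end

# Hypothetical helper: returns (g(u), g'(u)) for G_gaus, where g = G'
function gaus_g(u)
    e = exp(-u^2 / 2)
    return (u * e, (1 - u^2) * e)
end

# Central finite difference, used only to sanity-check the formulas
fd(f, u; h=1e-6) = (f(u + h) - f(u - h)) / (2h)

# Mimic the quantities described above for a projection vector U = w'x
U = [-1.0, 0.0, 0.5, 2.0]
gU = first.(tanh_g.(U))            # g(w'x), elementwise
E = mean(last.(tanh_g.(U)))        # sample estimate of E{g'(w'x)}

println(tanh_g(0.7)[1] - fd(G_tanh, 0.7))   # difference between analytic and numeric g
```

The analytic and finite-difference values should agree to within the accuracy of the difference scheme.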
References
- [1] Aapo Hyvärinen and Erkki Oja, Independent Component Analysis: Algorithms and Applications. Neural Networks 13(4-5), 2000.