Basics

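The package implements a variety of clustering algorithms, including K-means, K-medoids, affinity propagation, DBSCAN, the Markov Clustering Algorithm (MCL), fuzzy C-means, and hierarchical clustering.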

Most of the clustering functions in the package have a similar interface, making it easy to switch between different clustering algorithms.

Inputs

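A clustering algorithm, depending on its nature, may accept an input matrix in either of two forms: a data matrix X of size d×n, where the i-th column X[:, i] is a data point in d-dimensional space, or a distance matrix D of size n×n, where D[i, j] is the distance between the i-th and j-th points (or the cost of assigning them to the same cluster).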
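As a minimal sketch of the two kinds of input, the example below assumes Clustering.jl's kmeans (which takes a data matrix with one point per column) and kmedoids (which takes a pairwise distance matrix). The random data, the seed, and the hand-computed Euclidean distances are illustrative choices, not part of the package.

```julia
using Clustering
using Random

Random.seed!(1)            # illustrative seed, only for reproducibility

d, n = 2, 60
X = rand(d, n)             # data matrix: each column X[:, i] is one point

# Distance matrix: D[i, j] is the Euclidean distance between points i and j.
D = [sqrt(sum(abs2, X[:, i] .- X[:, j])) for i in 1:n, j in 1:n]

R_data = kmeans(X, 3)      # consumes the d×n data matrix
R_dist = kmedoids(D, 3)    # consumes the n×n distance matrix
```

Both calls return a result object, so the downstream code that inspects the clustering does not need to know which kind of input was used.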

Common Options

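Many clustering algorithms are iterative procedures. The functions share the basic options for controlling the iterations: maxiter (the maximum number of iterations), tol (the minimal allowed change of the objective between consecutive iterations; the algorithm is considered converged once the change drops below tol), and display (the amount of progress information to show, one of :none, :final, or :iter).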
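The iteration controls can be sketched with kmeans, assuming the keyword names maxiter, tol, and display that Clustering.jl's kmeans accepts. The specific values and the random data are arbitrary and chosen only for illustration; defaults and supported options may differ between functions.

```julia
using Clustering
using Random

Random.seed!(2)            # illustrative seed
X = rand(3, 200)           # 200 points in 3 dimensions, one per column

# Cap the run at 50 iterations, stop once the objective changes by
# less than 1e-4 between iterations, and print no progress output.
R = kmeans(X, 4; maxiter=50, tol=1e-4, display=:none)

println("iterations used: ", R.iterations)
```

Setting display=:iter instead would print progress at every iteration, which can help when tuning maxiter and tol.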

Results

A clustering function returns an object (typically an instance of some ClusteringResult subtype) that contains both the resulting clustering (e.g. the assignments of points to clusters) and information about the run of the algorithm (e.g. the number of iterations and whether it converged).
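A short sketch of inspecting such a result, assuming the object returned by kmeans (a KmeansResult in Clustering.jl) exposes the fields iterations and converged. Other result types may store run information under different fields.

```julia
using Clustering
using Random

Random.seed!(3)            # illustrative seed
X = rand(2, 100)
R = kmeans(X, 3)

# The result is a subtype of ClusteringResult.
println(R isa ClusteringResult)

# Information about the run of the algorithm.
println("iterations: ", R.iterations)
println("converged:  ", R.converged)
```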

ClusteringResult

Base type for the output of a clustering algorithm.

The following generic methods are supported by any subtype of ClusteringResult:

nclusters(R::ClusteringResult) -> Int

Get the number of clusters.

StatsBase.counts (method)
counts(R::ClusteringResult) -> Vector{Int}

Get the vector of cluster sizes.

counts(R)[k] is the number of points assigned to the $k$-th cluster.
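For illustration, the sketch below calls nclusters and counts on a kmeans result and checks how they relate. The data and seed are arbitrary assumptions.

```julia
using Clustering
using Random

Random.seed!(4)            # illustrative seed
X = rand(2, 90)
R = kmeans(X, 3)

k = nclusters(R)           # number of clusters
sizes = counts(R)          # sizes[c] is the number of points in cluster c

# There is one size per cluster, and the sizes cover every point once.
println(length(sizes) == k)
println(sum(sizes) == size(X, 2))
```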

Clustering.wcounts (method)
wcounts(R::ClusteringResult) -> Vector{Float64}
wcounts(R::FuzzyCMeansResult) -> Vector{Float64}

Get the weighted cluster sizes as the sum of weights of points assigned to each cluster.

For non-weighted clusterings, the weight of every data point is assumed to be 1.0, so the result is equivalent to convert(Vector{Float64}, counts(R)).
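A hedged sketch of both cases follows. It assumes that kmeans accepts per-point weights through a weights keyword, as in recent Clustering.jl releases; the weight values themselves are made up for the example.

```julia
using Clustering
using Random

Random.seed!(5)            # illustrative seed
X = rand(2, 80)

# Non-weighted clustering: every point counts as 1.0.
R = kmeans(X, 3)
println(wcounts(R))        # same values as counts(R)
println(counts(R))

# Weighted clustering: hypothetical weights, one per point.
w = rand(80) .+ 0.5
Rw = kmeans(X, 3; weights=w)
println(sum(wcounts(Rw)))  # total weight over all clusters
println(sum(w))            # total weight over all points
```

In the weighted run, the weighted sizes generally differ from the plain counts, because a cluster with a few heavy points can outweigh one with many light points.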

assignments(R::ClusteringResult) -> Vector{Int}

Get the vector of cluster indices for each point.

assignments(R)[i] is the index of the cluster to which the $i$-th point is assigned.
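One common use of the assignment vector is to collect the points that belong to each cluster. The sketch below does this for a kmeans result; the variable name members is a hypothetical helper introduced only for this example.

```julia
using Clustering
using Random

Random.seed!(6)            # illustrative seed
X = rand(2, 50)
R = kmeans(X, 3)

a = assignments(R)         # a[i] is the cluster index of point X[:, i]

# members[c] holds the indices of the points assigned to cluster c.
members = [findall(==(c), a) for c in 1:nclusters(R)]

# Columns of X that belong to the first cluster.
first_cluster_points = X[:, members[1]]
```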
