Statistics Toolbox    
cluster

Construct clusters from linkage output.

Syntax

T = cluster(Z,cutoff)
T = cluster(Z,cutoff,depth,flag)

Description

T = cluster(Z,cutoff) constructs clusters from the hierarchical cluster tree, Z, generated by the linkage function. Z is a matrix of size (m-1)-by-3, where m is the number of observations in the original data.

cutoff is a threshold value that determines how the cluster function creates clusters. How cluster interprets cutoff depends on its value:

Value            Meaning
0 < cutoff < 2   cutoff is interpreted as the threshold for the inconsistency coefficient. The inconsistency coefficient quantifies the degree of difference between objects in the hierarchical cluster tree. If the inconsistency coefficient of a link is greater than the threshold, the cluster function uses the link as a boundary for a cluster grouping. For more information about the inconsistency coefficient, see the inconsistent function.
cutoff >= 2      cutoff is interpreted as the maximum number of clusters to retain in the hierarchical tree.
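
The two interpretations of cutoff can be compared side by side. The following is a minimal sketch that uses the call form documented on this page; the random data, the matrix size, and the particular cutoff values 0.9 and 4 are illustrative assumptions, not values from the original page.

```matlab
% Illustrative data (assumption): 20 random observations in 3 dimensions.
X = rand(20,3);
Y = pdist(X);      % pairwise distances between observations
Z = linkage(Y);    % hierarchical cluster tree, (m-1)-by-3 with m = 20

% 0 < cutoff < 2: cutoff is treated as an inconsistency coefficient threshold.
T1 = cluster(Z,0.9);

% cutoff >= 2: cutoff is treated as the maximum number of clusters.
T2 = cluster(Z,4);

n1 = numel(unique(T1));   % number of clusters found; depends on the data
n2 = numel(unique(T2));   % no more than 4
```

With a threshold, the number of clusters is determined by the data; with a cluster count, it is capped by the value you supply.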

T = cluster(Z,cutoff,depth,flag) constructs clusters from cluster tree Z. The depth argument specifies the number of levels in the hierarchical cluster tree to include in the inconsistency coefficient computation. (The inconsistency coefficient compares a link between two objects in the cluster tree with neighboring links up to a specified depth. See the inconsistent function for more information.) When the depth argument is specified, cutoff is always interpreted as the inconsistency coefficient threshold.

The flag argument overrides the default meaning of the cutoff argument. If flag is 'inconsistent', then cutoff is interpreted as a threshold for the inconsistency coefficient. If flag is 'clusters', then cutoff is the maximum number of clusters.
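
As a sketch of how flag changes the meaning of the same cutoff value, the calls below use the four-argument form described above. The data, the depth of 2, and the cutoff of 3 are illustrative assumptions.

```matlab
% Illustrative data (assumption): 20 random observations in 3 dimensions.
X = rand(20,3);
Z = linkage(pdist(X));

% By its value alone, a cutoff of 3 would mean "at most 3 clusters".
% With flag 'inconsistent', it is instead an inconsistency threshold,
% with the coefficient computed over 2 levels of the tree.
Ta = cluster(Z,3,2,'inconsistent');

% With flag 'clusters', the same cutoff is a maximum number of clusters.
Tb = cluster(Z,3,2,'clusters');
```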

The output, T, is a vector of size m that identifies, by number, the cluster in which each object was grouped. To find out which objects from the original dataset are contained in cluster i, use find(T==i).
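
To make the use of find(T==i) concrete, the sketch below works on a hand-written assignment vector rather than real cluster output; the vector T and the variable name members2 are hypothetical.

```matlab
% Hypothetical assignment vector for m = 6 observations.
T = [1; 2; 2; 3; 1; 3];

% Indices of the observations assigned to cluster 2.
members2 = find(T==2);    % [2; 3]

% List the members of each cluster in turn.
for i = 1:max(T)
    fprintf('cluster %d: %s\n', i, mat2str(find(T==i)'));
end
```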

Example

The example uses the pdist function to calculate the distance between items in a matrix of random numbers and then uses the linkage function to compute the hierarchical cluster tree based on the matrix. The output of the linkage function is passed to the cluster function. The cutoff value 3 indicates that you want to group the items into three clusters. The example uses the find function to list all the items grouped into cluster 2.
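
The steps described above can be sketched as follows. The matrix size and the use of rand without a fixed seed are assumptions, so the indices returned by find vary from run to run.

```matlab
% Matrix of random numbers (size is an assumption): 30 items, 3 columns.
X = rand(30,3);

Y = pdist(X);       % distances between items
Z = linkage(Y);     % hierarchical cluster tree
T = cluster(Z,3);   % cutoff of 3: group the items into three clusters

% Items grouped into cluster 2.
inCluster2 = find(T==2)
```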

See Also

clusterdata, cophenet, dendrogram, inconsistent, linkage, pdist, squareform

