Statistics Toolbox
Construct clusters from linkage output.
Syntax
T = cluster(Z,cutoff)
T = cluster(Z,cutoff,depth,flag)
Description
T = cluster(Z,cutoff) constructs clusters from the hierarchical cluster tree, Z, generated by the linkage function. Z is a matrix of size (m-1)-by-3, where m is the number of observations in the original data.
cutoff is a threshold value that determines how the cluster function creates clusters. The value of cutoff determines how cluster interprets it.
| Value | Meaning |
| --- | --- |
| 0 < cutoff < 2 | cutoff is interpreted as the threshold for the inconsistency coefficient. The inconsistency coefficient quantifies the degree of difference between objects in the hierarchical cluster tree. If the inconsistency coefficient of a link is greater than the threshold, the cluster function uses the link as a boundary for a cluster grouping. For more information about the inconsistency coefficient, see the inconsistent function. |
| cutoff >= 2 | cutoff is interpreted as the maximum number of clusters to retain in the hierarchical tree. |
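As a minimal sketch of the two interpretations, assuming the two-argument syntax described on this page, the same tree can be cut either by an inconsistency threshold or by a maximum cluster count. The data matrix X and the threshold value 1.1 are arbitrary choices made for illustration and are not part of the original text.

```matlab
% Illustrative data: three groups of 10 points, shifted by 0, 1, and 2.
X = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
Z = linkage(pdist(X));        % (m-1)-by-3 cluster tree, here m = 30

% 0 < cutoff < 2: treated as an inconsistency coefficient threshold.
Tinc = cluster(Z, 1.1);

% cutoff >= 2: treated as the maximum number of clusters.
Tmax = cluster(Z, 3);

numClustersInc = max(Tinc)    % depends on the data and the threshold
numClustersMax = max(Tmax)    % at most 3
```

Because clusters are numbered from 1, the largest value in the output vector gives the number of clusters that were formed.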
T = cluster(Z,cutoff,depth,flag) constructs clusters from cluster tree Z. The depth argument specifies the number of levels in the hierarchical cluster tree to include in the inconsistency coefficient computation. (The inconsistency coefficient compares a link between two objects in the cluster tree with neighboring links up to a specified depth. See the inconsistent function for more information.) When the depth argument is specified, cutoff is always interpreted as the inconsistency coefficient threshold.
The flag argument overrides the default meaning of the cutoff argument. If flag is 'inconsistent', then cutoff is interpreted as a threshold for the inconsistency coefficient. If flag is 'clusters', then cutoff is the maximum number of clusters.
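The sketch below illustrates the depth and flag arguments, using the four-argument syntax given on this page. The data matrix X, the depth of 2, and the threshold of 1.0 are arbitrary values chosen for illustration. It also calls inconsistent with the same depth so that the coefficients being thresholded can be inspected; this assumes that inconsistent returns one row per link with the coefficient in the fourth column, as that function's documentation describes.

```matlab
X = [rand(10,3); rand(10,3)+1; rand(10,3)+2];   % illustrative data
Z = linkage(pdist(X));

depth = 2;                      % levels of the tree considered per link
I = inconsistent(Z, depth);     % one row per link in Z
coeffs = I(:,4);                % inconsistency coefficient of each link

% Treat cutoff as an inconsistency coefficient threshold of 1.0.
T1 = cluster(Z, 1.0, depth, 'inconsistent');

% Same depth, but force cutoff to be read as a cluster count.
T2 = cluster(Z, 3, depth, 'clusters');
```

Looking at coeffs before choosing a threshold can help, since a cutoff above every coefficient leaves all objects in a single cluster.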
The output, T, is a vector of size m that identifies, by number, the cluster in which each object was grouped. To find out which objects from the original dataset are contained in cluster i, use find(T==i).
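For instance, the members of every cluster can be listed with a short loop over the cluster numbers. This is a sketch only: the vector T below is a hand-written stand-in for the output of cluster, not a result computed from real data.

```matlab
% Stand-in for the output of cluster: 6 objects assigned to 3 clusters.
T = [1; 2; 2; 3; 1; 3];

for i = 1:max(T)
    members = find(T == i)';    % row vector of object indices in cluster i
    fprintf('cluster %d: %s\n', i, mat2str(members));
end
% Prints:
%   cluster 1: [1 5]
%   cluster 2: [2 3]
%   cluster 3: [4 6]
```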
Example
The example uses the pdist function to calculate the distances between items in a matrix of random numbers and then uses the linkage function to compute the hierarchical cluster tree from those distances. The output of the linkage function is passed to the cluster function. The cutoff value 3 indicates that you want to group the items into three clusters. The example uses the find function to list all the items grouped into cluster 3.
rand('seed', 0);
X = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
Y = pdist(X);
Z = linkage(Y);
T = cluster(Z,3);
find(T==3)
ans =
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
See Also
clusterdata, cophenet, dendrogram, inconsistent, linkage, pdist, squareform