Statistics Toolbox
Syntax
T = clusterdata(X,cutoff)
Description
T = clusterdata(X,cutoff) constructs clusters from the data matrix X. X is a matrix of size m by n, interpreted as m observations of n variables.
cutoff is a threshold value that determines how the cluster function creates clusters. The value of cutoff determines how clusterdata interprets it:
Value | Meaning |
0 < cutoff < 1 | cutoff is interpreted as the threshold for the inconsistency coefficient. The inconsistency coefficient quantifies the degree of difference between objects in the hierarchical cluster tree. If the inconsistency coefficient of a link is greater than the threshold, the cluster function uses the link as a boundary for a cluster grouping. For more information about the inconsistency coefficient, see the inconsistent function. |
cutoff >= 1 | cutoff is interpreted as the maximum number of clusters to retain in the hierarchical tree. |
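The two readings of cutoff can be compared side by side. The following is a minimal sketch, not a reference example: the data matrix and the cutoff values 0.9 and 2 are assumptions chosen only for illustration.

```matlab
% Illustrative data (hypothetical): two well-separated groups of 2-D points.
X = [rand(10,2); rand(10,2)+5];

% cutoff between 0 and 1: treated as an inconsistency-coefficient threshold.
T_incons = clusterdata(X,0.9);

% cutoff of 1 or more: treated as the maximum number of clusters to retain.
T_max = clusterdata(X,2);

% Both results hold one cluster number per row of X.
disp([numel(T_incons) numel(T_max)])

% Under the maximum-cluster reading, no cluster number exceeds 2.
disp(max(T_max))
```

The number of clusters produced by the threshold form depends on the data, so it is not fixed in advance the way it is with the maximum-cluster form.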
The output, T, is a vector of size m that identifies, by number, the cluster in which each object was grouped.
T = clusterdata(X,cutoff) is the same as
Y = pdist(X,'euclid');
Z = linkage(Y,'single');
T = cluster(Z,cutoff);
To use nondefault parameters for pdist and linkage, call these three functions in this sequence instead of calling clusterdata.
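As a sketch of that sequence with nondefault parameters, the code below swaps in city block distance and average linkage. The data matrix and both parameter choices are assumptions made for illustration. The final call uses the two-argument cluster(Z,cutoff) form given on this page; if a later release rejects it, the name-value form cluster(Z,'maxclust',3) is the likely replacement, which is also an assumption here.

```matlab
% Illustrative data (hypothetical): 30 observations of 3 variables.
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];

% Step 1: pairwise distances, using city block instead of Euclidean distance.
Y = pdist(X,'cityblock');

% Step 2: hierarchical cluster tree, using average instead of single linkage.
Z = linkage(Y,'average');

% Step 3: cut the tree into at most 3 clusters, using the cluster(Z,cutoff)
% form documented on this page.
T = cluster(Z,3);
```

Keeping Y and Z as separate variables also makes them available to other functions, such as dendrogram or inconsistent, without recomputing them.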
Example
The example first creates a sample dataset of random numbers. It then uses the clusterdata function to compute the distances between items in the dataset, create a hierarchical cluster tree from those distances, and group the items into three clusters. Finally, the example uses the find function to list all the items in cluster 2.
rand('seed',12);
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];
T = clusterdata(X,3);
find(T==2)
ans =
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
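The same find pattern extends to every cluster. The loop below is a hedged sketch that reuses the example's data and reports how many items fall into each cluster; the loop and the variable names k and members are additions for illustration, not part of the original example.

```matlab
% Recreate the example's dataset and cluster assignment.
rand('seed',12);
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];
T = clusterdata(X,3);

% List the size of each cluster, one line per cluster number.
counts = zeros(1,max(T));
for k = 1:max(T)
    members = find(T==k);          % row indices of X assigned to cluster k
    counts(k) = numel(members);
    fprintf('cluster %d: %d items\n', k, counts(k));
end
```

Because every row of X belongs to exactly one cluster, the counts add up to the number of rows in X.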
See Also
cluster, cophenet, dendrogram, inconsistent, linkage, pdist, squareform