Statistics Toolbox
Example: Inconsistent Links. To illustrate, the following example creates a data set of random numbers with three deliberate natural groupings. In the dendrogram, note how the objects tend to collect into three groups. These three groups are then connected by two longer links. These longer links are inconsistent when compared with the links below them in the hierarchy.
    rand('seed',3)
    X = [rand(10,2)+1;rand(10,2)+2;rand(10,2)+3];
    Y = pdist(X);
    Z = linkage(Y);
    dendrogram(Z);
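The longest links can also be read off numerically. The sketch below assumes the usual layout of the linkage output, in which the third column of Z holds the length of each link. The data is regenerated without a seed, so the values differ from run to run, and the variable name linkLengths is illustrative.

```matlab
% Three loose groups of ten points each, built the same way as above.
X = [rand(10,2)+1; rand(10,2)+2; rand(10,2)+3];
Z = linkage(pdist(X));

% Third column of Z: the length of each of the 29 links.
linkLengths = sort(Z(:,3), 'descend');

% With three groups, the two longest links are the candidates
% for the links that join the groups to one another.
disp(linkLengths(1:2))
```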
The relative consistency of each link in a hierarchical cluster tree can be quantified and expressed as the inconsistency coefficient. This value compares the length of a link in a cluster hierarchy with the average length of neighboring links. If a link is consistent with those around it, it will have a low inconsistency coefficient. If a link is inconsistent with those around it, it will have a higher inconsistency coefficient.
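One way to write this comparison down, using symbols introduced here only for illustration: let $d_k$ be the length of link $k$, and let $\bar{d}_k$ and $s_k$ be the mean and standard deviation of the lengths of the links included in the comparison, the link itself among them. The coefficient is then the distance of the link from that mean, measured in standard deviations:

$$
Y_k = \frac{d_k - \bar{d}_k}{s_k}
$$

When a link is compared only with itself, $s_k = 0$ and the coefficient is taken to be zero.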
To generate a listing of the inconsistency coefficient for each link in the cluster tree, use the inconsistent function. By default, the inconsistent function compares each link in the cluster hierarchy with adjacent links that are less than two levels below it, that is, with the link itself and the links directly below it. This is called the depth of the comparison. Using the inconsistent function, you can specify other depths. The objects at the bottom of the cluster tree that have no further objects below them, called leaf nodes, have an inconsistency coefficient of zero.
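As a minimal sketch of specifying another depth, the depth can be passed as a second argument to inconsistent. The example regenerates the three-group data without a seed, so the exact values vary between runs; the names I2 and I3 are illustrative.

```matlab
% Three loose groups of ten points each (unseeded random data).
X = [rand(10,2)+1; rand(10,2)+2; rand(10,2)+3];
Z = linkage(pdist(X));

I2 = inconsistent(Z);      % default depth of 2
I3 = inconsistent(Z, 3);   % depth 3: also include links one level further down

% Column 3 holds the number of links used in each calculation.
% A deeper comparison can only keep or increase that count.
disp([I2(:,3) I3(:,3)])
```

A larger depth compares each link against more of the subtree beneath it, so a single short link directly below has less influence on the coefficient.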
For example, returning to the sample data set of x and y coordinates, we can use the inconsistent function to calculate the inconsistency values for the links created by the linkage function, described in Defining the Links Between Objects.
    I = inconsistent(Z)
    I =
        1.0000         0    1.0000         0
        1.0000         0    1.0000         0
        1.3539    0.8668    3.0000    0.8165
        2.2808    0.3100    2.0000    0.7071
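The inconsistent function returns data about the links in an (m-1)-by-4 matrix, where m is the number of objects in the original data set. Each row describes one link, in the same order as the output of the linkage function. Column one holds the mean length of the links included in the calculation, column two the standard deviation of those lengths, column three the number of links included, and column four the inconsistency coefficient.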
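A minimal sketch of reading the columns, with the sample output typed in by hand rather than recomputed. The variable names are illustrative and not part of the toolbox.

```matlab
% Sample output of inconsistent(Z), copied from the listing above.
I = [1.0000 0      1.0000 0
     1.0000 0      1.0000 0
     1.3539 0.8668 3.0000 0.8165
     2.2808 0.3100 2.0000 0.7071];

meanLen  = I(:,1);   % mean length of the links in each calculation
stdLen   = I(:,2);   % standard deviation of those lengths
numLinks = I(:,3);   % number of links in each calculation
coef     = I(:,4);   % inconsistency coefficient

% Rows that compare a link only with itself have a coefficient of zero.
leafRows = find(numLinks == 1);
disp(leafRows')      % rows 1 and 2 in the sample output
```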
In the sample output, the first row represents the link between objects 1 and 3. (This cluster is assigned the index 6 by the linkage function.) Because this is a leaf node, the inconsistency coefficient is zero. The second row represents the link between objects 4 and 5, also a leaf node. (This cluster is assigned the index 7 by the linkage function.)
The third row evaluates the link that connects these two leaf nodes, objects 6 and 7. (This cluster is called object 8 in the linkage output.) Column three indicates that three links are considered in the calculation: the link itself and the two links directly below it in the hierarchy. Column one represents the mean of the lengths of these links. The inconsistent function uses the length information output by the linkage function to calculate the mean. Column two represents the standard deviation of the lengths of these links. The last column contains the inconsistency value for this link, 0.8165.
The following figure illustrates the links and lengths included in this calculation.
Row four in the output matrix describes the link between object 8 and object 2. Column three indicates that two links are included in this calculation: the link itself and the link directly below it in the hierarchy. The inconsistency coefficient for this link is 0.7071.
The following figure illustrates the links and lengths included in this calculation.
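The values in row four can be checked by hand. The sketch below is an illustration under stated assumptions: the two link lengths are not listed in this section and are back-computed, to four decimal places, from the mean columns of the sample output (rows three and four), and the standard deviation is assumed to be the sample standard deviation that MATLAB's std returns by default.

```matlab
% Assumed link lengths, back-computed from the mean columns of the sample output:
%   row 3: mean 1.3539 over lengths {1, 1, L8}  gives L8 of about 2.0616
%   row 4: mean 2.2808 over lengths {L8, L9}    gives L9 of about 2.5000
lengths = [2.0616 2.5000];          % link 8 (directly below) and link 9 (itself)

avg  = mean(lengths);               % column one
sd   = std(lengths);                % column two (divides by n-1)
n    = numel(lengths);              % column three
coef = (lengths(end) - avg) / sd;   % column four

fprintf('%.4f  %.4f  %d  %.4f\n', avg, sd, n, coef);
% prints: 2.2808  0.3100  2  0.7071
```

Whenever exactly two links are compared, the longer one lies 1/sqrt(2), or about 0.7071, sample standard deviations above their mean, so a coefficient of 0.7071 is typical for a link with a single link below it.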