Statistics Toolbox    
pdist

Pairwise distance between observations.

Syntax

Y = pdist(X)
Y = pdist(X,'metric')
Y = pdist(X,'minkowski',p)

Description

Y = pdist(X) computes the Euclidean distance between pairs of objects in the m-by-n matrix X, which is treated as m vectors of size n. For a dataset made up of m objects, there are m(m-1)/2 pairs.

The output, Y, is a vector of length m(m-1)/2 containing the distance information. The distances are arranged in the order (1,2), (1,3), ..., (1,m), (2,3), ..., (2,m), ..., (m-1,m). Y is also commonly known as a similarity matrix or dissimilarity matrix.
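
The pair ordering described above can be sketched with two nested loops. This is only an illustration that uses no toolbox functions; the data matrix X and the helper variable pairs are hypothetical and are not part of pdist itself.

```matlab
% Hypothetical 4-by-2 data matrix: m = 4 observations, n = 2 variables.
X = [0 0; 3 4; 6 8; 0 5];
m = size(X, 1);

% One entry per unordered pair, so m*(m-1)/2 entries in total.
numPairs = m*(m-1)/2;
Y = zeros(1, numPairs);
pairs = zeros(numPairs, 2);

k = 0;
for i = 1:m-1
    for j = i+1:m
        k = k + 1;
        pairs(k, :) = [i j];                      % record which pair sits at position k
        Y(k) = sqrt(sum((X(i,:) - X(j,:)).^2));   % Euclidean distance between rows i and j
    end
end
```

With m = 4 the loop visits the pairs in the order (1,2), (1,3), (1,4), (2,3), (2,4), (3,4), so Y has 6 elements.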

To save space and computation time, Y is formatted as a vector. However, you can convert this vector into a square matrix using the squareform function so that element i,j in the matrix corresponds to the distance between objects i and j in the original dataset.

Y = pdist(X,'metric') computes the distance between objects in the data matrix, X, using the method specified by 'metric', where 'metric' can be any of the following character strings that identify ways to compute the distance.

String         Meaning
'Euclid'       Euclidean distance (default)
'SEuclid'      Standardized Euclidean distance
'Mahal'        Mahalanobis distance
'CityBlock'    City Block metric
'Minkowski'    Minkowski metric

Y = pdist(X,'minkowski',p) computes the distance between objects in the data matrix, X, using the Minkowski metric. p is the exponent used in the Minkowski computation; by default, p is 2.
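
As a rough illustration of the effect of p, the sketch below compares pdist with a hand-written Minkowski computation on a hypothetical two-point dataset. It assumes the standard Minkowski form (sum of absolute differences raised to p, then the 1/p root) and that the lowercase string 'minkowski' is accepted by the installed version of the toolbox.

```matlab
% Hypothetical data: two observations in two dimensions.
X = [0 0; 3 4];
delta = abs(X(1,:) - X(2,:));     % per-coordinate absolute differences: [3 4]

% Hand-written Minkowski distance for two exponents.
d1 = sum(delta.^1)^(1/1);         % p = 1 reduces to the City Block distance, 7
d2 = sum(delta.^2)^(1/2);         % p = 2 reduces to the Euclidean distance, 5

% The same quantities through pdist.
Y1 = pdist(X, 'minkowski', 1);
Y2 = pdist(X, 'minkowski', 2);
```

Under these assumptions Y1 agrees with d1 and Y2 agrees with d2.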

Mathematical Definitions of Methods

Given an m-by-n data matrix X, which is treated as m (1-by-n) row vectors x1, x2, ..., xm, the various distances between the vectors xr and xs are defined as follows:
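
The standard forms of these five distances are written out below in LaTeX. They are the textbook definitions and are assumed, not confirmed, to match what pdist computes. The symbols D (a diagonal matrix whose j-th diagonal element is the variance of the j-th column of X over the m objects) and V (the sample covariance matrix of X) are notation introduced here for illustration.

```latex
\begin{align*}
\text{Euclidean:} \quad
  & d_{rs}^{2} = (x_r - x_s)(x_r - x_s)^{\top} \\
\text{Standardized Euclidean:} \quad
  & d_{rs}^{2} = (x_r - x_s)\, D^{-1} (x_r - x_s)^{\top} \\
\text{Mahalanobis:} \quad
  & d_{rs}^{2} = (x_r - x_s)\, V^{-1} (x_r - x_s)^{\top} \\
\text{City Block:} \quad
  & d_{rs} = \sum_{j=1}^{n} \lvert x_{rj} - x_{sj} \rvert \\
\text{Minkowski:} \quad
  & d_{rs} = \Bigl( \sum_{j=1}^{n} \lvert x_{rj} - x_{sj} \rvert^{p} \Bigr)^{1/p}
\end{align*}
```

In this form the Minkowski metric reduces to the City Block metric for p = 1 and to the Euclidean distance for p = 2.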

Examples
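
A minimal usage sketch, assuming a hypothetical 3-by-2 data matrix chosen so that the distances are easy to check by hand:

```matlab
% Hypothetical data: three points on a line through the origin.
X = [0 0; 3 4; 6 8];

% Default Euclidean distances for pairs (1,2), (1,3), (2,3).
Y = pdist(X);          % [5 10 5]

% Expand the vector into a symmetric 3-by-3 matrix with zeros on the diagonal,
% so that D(i,j) is the distance between observations i and j.
D = squareform(Y);     % [0 5 10; 5 0 5; 10 5 0]
```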

See Also

cluster, clusterdata, cophenet, dendrogram, inconsistent, linkage, squareform

