cophenet
- scipy.cluster.hierarchy.cophenet(Z, Y=None)
- Calculate the cophenetic distances between each pair of observations in the hierarchical clustering defined by the linkage Z.

Suppose p and q are original observations in disjoint clusters s and t, respectively, and s and t are joined by a direct parent cluster u. The cophenetic distance between observations p and q is simply the distance between clusters s and t.

- Parameters:
- Z : ndarray
- The hierarchical clustering encoded as an array (see the linkage function).
- Y : ndarray (optional)
- The condensed distance matrix from which Z was generated. If Y is given, the function also calculates the cophenetic correlation coefficient c of the hierarchical clustering defined by the linkage matrix Z of a set of \(n\) observations in \(m\) dimensions.
 
- Returns:
- c : ndarray
- The cophenetic correlation coefficient (returned only if Y is passed).
- d : ndarray
- The cophenetic distance matrix in condensed form. The \(ij\) th entry is the cophenetic distance between original observations \(i\) and \(j\). 
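As a minimal sketch of how the condensed layout can be read, the snippet below maps a pair (i, j) to its position in d and cross-checks the result against the square form. The helper condensed_index and the five-point dataset are hypothetical, introduced only for illustration; they are not part of SciPy.

```python
import numpy as np
from scipy.cluster.hierarchy import single, cophenet
from scipy.spatial.distance import pdist, squareform


def condensed_index(i, j, n):
    """Hypothetical helper: position of the pair (i, j), i != j, in a
    condensed vector built from n observations (row-major upper triangle)."""
    if i == j:
        raise ValueError("condensed form has no diagonal entries")
    if i > j:
        i, j = j, i
    return n * i - i * (i + 1) // 2 + (j - i - 1)


# Illustrative data only (an assumption, not taken from the documentation).
X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.5], [9.0, 9.0]])
n = len(X)

d = cophenet(single(pdist(X)))   # condensed cophenetic distances
D = squareform(d)                # the same distances as an n-by-n matrix

# Every off-diagonal entry of D should be reachable through the helper.
matches = all(
    d[condensed_index(i, j, n)] == D[i, j]
    for i in range(n)
    for j in range(n)
    if i != j
)
```

The condensed vector holds n * (n - 1) / 2 entries, one per unordered pair, so the helper treats (i, j) and (j, i) as the same position.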
 
- See also
- linkage
- for a description of what a linkage matrix is. 
- scipy.spatial.distance.squareform
- transforming condensed matrices into square ones. 
- Examples

>>> from scipy.cluster.hierarchy import single, cophenet
>>> from scipy.spatial.distance import pdist, squareform

Given a dataset X and a linkage matrix Z, the cophenetic distance between two points of X is the distance between the largest two distinct clusters that each of the points belongs to:

>>> X = [[0, 0], [0, 1], [1, 0],
...      [0, 4], [0, 3], [1, 4],
...      [4, 0], [3, 0], [4, 1],
...      [4, 4], [3, 4], [4, 3]]

X corresponds to this dataset:

    x x    x x
    x        x

    x        x
    x x    x x

>>> Z = single(pdist(X))
>>> Z
array([[ 0.,  1.,  1.,  2.],
       [ 2., 12.,  1.,  3.],
       [ 3.,  4.,  1.,  2.],
       [ 5., 14.,  1.,  3.],
       [ 6.,  7.,  1.,  2.],
       [ 8., 16.,  1.,  3.],
       [ 9., 10.,  1.,  2.],
       [11., 18.,  1.,  3.],
       [13., 15.,  2.,  6.],
       [17., 20.,  2.,  9.],
       [19., 21.,  2., 12.]])
>>> cophenet(Z)
array([1., 1., 2., 2., 2., 2., 2., 2., 2., 2., 2., 1., 2., 2., 2., 2., 2.,
       2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 1., 1., 2., 2.,
       2., 2., 2., 2., 1., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.,
       1., 1., 2., 2., 2., 1., 2., 2., 2., 2., 2., 2., 1., 1., 1.])

The output of the scipy.cluster.hierarchy.cophenet function is represented in condensed form. We can use scipy.spatial.distance.squareform to see the output as a regular matrix (where each element ij denotes the cophenetic distance between the i, j pair of points in X):

>>> squareform(cophenet(Z))
array([[0., 1., 1., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [1., 0., 1., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [1., 1., 0., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 0., 1., 1., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 1., 0., 1., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 1., 1., 0., 2., 2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 0., 1., 1., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 1., 0., 1., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 1., 1., 0., 2., 2., 2.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 0., 1., 1.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 1., 0., 1.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 1., 1., 0.]])

In this example, the cophenetic distance between points of X that are very close (i.e., in the same corner) is 1. For all other pairs of points it is 2, because those points lie in clusters at different corners, so the distance between these clusters is larger.
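The optional Y argument can be exercised on the same dataset. The following is a minimal sketch, assuming Y is the condensed output of pdist that Z was built from. The cross-check with numpy.corrcoef reflects the usual reading of the cophenetic correlation coefficient as the Pearson correlation between the original distances and the cophenetic distances; it is added here for illustration and is not a statement about how SciPy computes c internally.

```python
import numpy as np
from scipy.cluster.hierarchy import single, cophenet
from scipy.spatial.distance import pdist

X = [[0, 0], [0, 1], [1, 0],
     [0, 4], [0, 3], [1, 4],
     [4, 0], [3, 0], [4, 1],
     [4, 4], [3, 4], [4, 3]]

Y = pdist(X)    # condensed distance matrix of the original observations
Z = single(Y)   # linkage matrix generated from Y

# With Y supplied, cophenet returns a pair: the coefficient c and the
# condensed cophenetic distances d.
c, d = cophenet(Z, Y)

# Illustrative cross-check (an assumption about interpretation): c should
# agree with the Pearson correlation between Y and d.
c_check = np.corrcoef(Y, d)[0, 1]
print(np.isclose(c, c_check))
```

A value of c close to 1 suggests that the hierarchy preserves the original pairwise distances well; d is the same array that cophenet(Z) returns on its own.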