rankdata
- scipy.stats.rankdata(a, method='average', *, axis=None, nan_policy='propagate')
- Assign ranks to data, dealing with ties appropriately.
- By default (axis=None), the data array is first flattened, and a flat array of ranks is returned. Separately reshape the rank array to the shape of the data array if desired (see Examples).
- Ranks begin at 1. The method argument controls how ranks are assigned to equal values. See [1] for further discussion of ranking methods.
- Parameters:
- a : array_like
- The array of values to be ranked. 
- method : {‘average’, ‘min’, ‘max’, ‘dense’, ‘ordinal’}, optional
- The method used to assign ranks to tied elements. The following methods are available (default is ‘average’):
- ‘average’: The average of the ranks that would have been assigned to all the tied values is assigned to each value. 
- ‘min’: The minimum of the ranks that would have been assigned to all the tied values is assigned to each value. (This is also referred to as “competition” ranking.) 
- ‘max’: The maximum of the ranks that would have been assigned to all the tied values is assigned to each value. 
- ‘dense’: Like ‘min’, but the next highest element is assigned the rank immediately after the rank given to the tied elements, so the ranks contain no gaps. 
- ‘ordinal’: All values are given a distinct rank; tied values are ranked in the order in which they occur in a. 
 
- axis : {None, int}, optional
- Axis along which to perform the ranking. If None, the data array is first flattened.
- nan_policy : {‘propagate’, ‘omit’, ‘raise’}, optional
- Defines how to handle input that contains nan. The following options are available (default is ‘propagate’):
- ‘propagate’: propagates nans through the rank calculation 
- ‘omit’: performs the calculations ignoring nan values 
- ‘raise’: raises an error 
- Note: When nan_policy is ‘propagate’, the output is an array of all nans because ranks relative to nans in the input are undefined. When nan_policy is ‘omit’, nans in a are ignored when ranking the other values, and the corresponding locations of the output are nan.
- Added in version 1.10. 
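The three nan_policy options can be compared side by side. The following is a minimal sketch, not part of the original documentation; the variable names are illustrative, and catching ValueError is an assumption about the exception type, since the text above only states that an error is raised.

```python
import numpy as np
from scipy.stats import rankdata

data = [0, 2, 3, np.nan, -2, np.nan]

# 'omit': nans are skipped when ranking the other values,
# and their positions stay nan in the output.
ranks_omit = rankdata(data, nan_policy="omit")

# Boolean mask of positions that received a real rank.
valid = ~np.isnan(ranks_omit)

# 'propagate' (the default): the nans in this 1-D input make every rank nan.
ranks_propagate = rankdata(data, nan_policy="propagate")

# 'raise': an exception is raised instead of returning ranks.
# Assumption: the exception is a ValueError.
try:
    rankdata(data, nan_policy="raise")
    raised = False
except ValueError:
    raised = True

print(ranks_omit)
print(valid)
print(raised)
```

The mask `valid` is one way to select only the ranked entries, for example `ranks_omit[valid]`, before passing the ranks to code that does not accept nan.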
 
- Returns:
- ranks : ndarray
- An array of size equal to the size of a, containing rank scores. 
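The shape of the returned array depends on axis. The sketch below, with illustrative variable names, shows the flat result produced by the default axis=None, how it can be reshaped to line up with the input, and the result of ranking along an explicit axis.

```python
import numpy as np
from scipy.stats import rankdata

x = np.array([[0, 2, 2],
              [3, 2, 5]])

# axis=None (default): the input is flattened, so the result is 1-D.
flat_ranks = rankdata(x)

# Reshape so that each rank lines up with its value in x.
global_ranks = flat_ranks.reshape(x.shape)

# With an explicit axis, each slice along that axis is ranked independently
# and the result already has the shape of the input.
column_ranks = rankdata(x, axis=0)

print(flat_ranks.shape, global_ranks.shape, column_ranks.shape)
```

Note that `global_ranks` ranks every element against the whole array, while `column_ranks` ranks each column on its own, so the two generally differ even though their shapes match.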
 
- References
- [1] “Ranking”, https://en.wikipedia.org/wiki/Ranking
- Examples
>>> import numpy as np
>>> from scipy.stats import rankdata
>>> rankdata([0, 2, 3, 2])
array([ 1. , 2.5, 4. , 2.5])
>>> rankdata([0, 2, 3, 2], method='min')
array([ 1, 2, 4, 2])
>>> rankdata([0, 2, 3, 2], method='max')
array([ 1, 3, 4, 3])
>>> rankdata([0, 2, 3, 2], method='dense')
array([ 1, 2, 3, 2])
>>> rankdata([0, 2, 3, 2], method='ordinal')
array([ 1, 2, 4, 3])
>>> rankdata([[0, 2], [3, 2]]).reshape(2,2)
array([[1. , 2.5],
       [4. , 2.5]])
>>> rankdata([[0, 2, 2], [3, 2, 5]], axis=1)
array([[1. , 2.5, 2.5],
       [2. , 1. , 3. ]])
>>> rankdata([0, 2, 3, np.nan, -2, np.nan], nan_policy="propagate")
array([nan, nan, nan, nan, nan, nan])
>>> rankdata([0, 2, 3, np.nan, -2, np.nan], nan_policy="omit")
array([ 2., 3., 4., nan, 1., nan])
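The tie-handling methods are related to one another, and the relationships can be checked with plain NumPy. The sketch below is illustrative only: the NumPy expressions reproduce the ranks for a 1-D input without nans and are not a description of how SciPy computes them. All variable names are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata

a = np.array([0, 2, 3, 2])

r_min = rankdata(a, method="min")
r_max = rankdata(a, method="max")
r_avg = rankdata(a, method="average")
r_dense = rankdata(a, method="dense")
r_ordinal = rankdata(a, method="ordinal")

# A tie group occupies consecutive ranks, so its average rank is the
# midpoint of its 'min' and 'max' ranks.
avg_from_min_max = (r_min + r_max) / 2

# 'dense' numbers the distinct values 1, 2, 3, ...; the inverse index
# returned by np.unique does the same, starting at 0.
dense_from_unique = np.unique(a, return_inverse=True)[1] + 1

# 'ordinal' breaks ties by position: a stable argsort keeps tied values in
# input order, and a second argsort inverts that permutation.
ordinal_from_argsort = np.argsort(np.argsort(a, kind="stable")) + 1

print(avg_from_min_max, dense_from_unique, ordinal_from_argsort)
```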