siegelslopes#
- scipy.stats.siegelslopes(y, x=None, method='hierarchical')[source]#
- Computes the Siegel estimator for a set of points (x, y). - siegelslopesimplements a method for robust linear regression using repeated medians (see [1]) to fit a line to the points (x, y). The method is robust to outliers with an asymptotic breakdown point of 50%.- Parameters:
- yarray_like
- Dependent variable. 
- xarray_like or None, optional
- Independent variable. If None, use - arange(len(y))instead.
- method{‘hierarchical’, ‘separate’}
- If ‘hierarchical’, estimate the intercept using the estimated slope - slope(default option). If ‘separate’, estimate the intercept independent of the estimated slope. See Notes for details.
 
- Returns:
- resultSiegelslopesResultinstance
- The return value is an object with the following attributes: - slopefloat
- Estimate of the slope of the regression line. 
- interceptfloat
- Estimate of the intercept of the regression line. 
 
 
- result
 - See also - theilslopes
- a similar technique without repeated medians 
 - Notes - With - n = len(y), compute- m_jas the median of the slopes from the point- (x[j], y[j])to all other n-1 points.- slopeis then the median of all slopes- m_j. Two ways are given to estimate the intercept in [1] which can be chosen via the parameter- method. The hierarchical approach uses the estimated slope- slopeand computes- interceptas the median of- y - slope*x. The other approach estimates the intercept separately as follows: for each point- (x[j], y[j]), compute the intercepts of all the n-1 lines through the remaining points and take the median- i_j.- interceptis the median of the- i_j.- The implementation computes n times the median of a vector of size n which can be slow for large vectors. There are more efficient algorithms (see [2]) which are not implemented here. - For compatibility with older versions of SciPy, the return value acts like a - namedtupleof length 2, with fields- slopeand- intercept, so one can continue to write:- slope, intercept = siegelslopes(y, x) - References [1] (1,2)- A. Siegel, “Robust Regression Using Repeated Medians”, Biometrika, Vol. 69, pp. 242-244, 1982. [2]- A. Stein and M. Werman, “Finding the repeated median regression line”, Proceedings of the Third Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 409-413, 1992. - Examples - >>> import numpy as np >>> from scipy import stats >>> import matplotlib.pyplot as plt - >>> x = np.linspace(-5, 5, num=150) >>> y = x + np.random.normal(size=x.size) >>> y[11:15] += 10 # add outliers >>> y[-5:] -= 7 - Compute the slope and intercept. For comparison, also compute the least-squares fit with - linregress:- >>> res = stats.siegelslopes(y, x) >>> lsq_res = stats.linregress(x, y) - Plot the results. The Siegel regression line is shown in red. The green line shows the least-squares fit for comparison. - >>> fig = plt.figure() >>> ax = fig.add_subplot(111) >>> ax.plot(x, y, 'b.') >>> ax.plot(x, res[1] + res[0] * x, 'r-') >>> ax.plot(x, lsq_res[1] + lsq_res[0] * x, 'g-') >>> plt.show() 