A note to self, since I sometimes want to use quartiles to detect outliers.
Q1 = series.quantile(.25)
Q3 = series.quantile(.75)
Or, equivalently, using describe():
Q1 = series.describe()['25%']
Q3 = series.describe()['75%']
# Extract only the rows whose value in column A is an outlier (above the upper fence)
IQR = Q3 - Q1
threshold = Q3 + 1.5 * IQR
df_outlier = df[df['A'] > threshold]
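The snippet above only flags values above the upper fence. As a minimal self-contained sketch of a two-sided check, assuming a small made-up DataFrame with a column A (the sample values and the names lower, upper, and mask are illustrative and not from the original note):

```python
import pandas as pd

# Hypothetical sample data; 100 and -50 are meant to be the outliers.
df = pd.DataFrame({'A': [10, 12, 11, 13, 12, 100, 11, -50]})

# Quartiles and interquartile range of column A.
Q1 = df['A'].quantile(0.25)
Q3 = df['A'].quantile(0.75)
IQR = Q3 - Q1

# Lower and upper fences using the common 1.5 * IQR rule.
lower = Q1 - 1.5 * IQR
upper = Q3 + 1.5 * IQR

# Boolean mask: True where the value lies outside either fence.
mask = (df['A'] < lower) | (df['A'] > upper)
df_outlier = df[mask]
print(df_outlier)
```

Comparing the column directly against the fences is vectorized, so no apply with a lambda is needed.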
Conversely, if you want the data that falls within the threshold, you can either take the logical negation, as in df[~df ...], or reverse the direction of the inequality sign.
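A rough sketch of the negation approach, assuming made-up sample data and a two-sided fence (the names mask, df_inlier, and df_inlier_between are hypothetical). It also shows Series.between, which is inclusive at both ends by default, as one alternative to negating the mask:

```python
import pandas as pd

# Hypothetical sample data; 100 and -50 are meant to be the outliers.
df = pd.DataFrame({'A': [10, 12, 11, 13, 12, 100, 11, -50]})

Q1 = df['A'].quantile(0.25)
Q3 = df['A'].quantile(0.75)
IQR = Q3 - Q1
lower = Q1 - 1.5 * IQR
upper = Q3 + 1.5 * IQR

# True where the value is an outlier on either side.
mask = (df['A'] < lower) | (df['A'] > upper)

# Logical negation keeps the rows that are NOT outliers.
df_inlier = df[~mask]

# Alternative: keep rows with lower <= A <= upper directly.
df_inlier_between = df[df['A'].between(lower, upper)]

print(df_inlier)
```

The two selections match on this sample. They differ when column A contains missing values: a NaN row passes ~mask (both comparisons are False) but is dropped by between.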
There is probably some function that can pull out outliers in one shot without doing all this ...
I am a statistics beginner tinkering with data using pandas, so I would be grateful if you could tell me about any better way.
Reference for Series.quantile: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.quantile.html
Refer to this document for quartiles http://www.contents-station.net/gacco/Data_Analysis_Innovation/Week03/3-4.pdf