Reputation: 25
I want to find the 3rd percentile of the following data and then average the data. The data structure is shown below.
0       NaN
1       NaN
2       NaN
3       NaN
4       NaN
...     ...
96927   NaN
96928   NaN
96929   NaN
96930   NaN
96931   NaN
The relevant (non-NaN) data lies in rows 13240:61156. My code is given below:
import pandas as pd
import numpy as np

load_var = pd.read_excel(r'path\file name.xlsx')
load_var
a = pd.DataFrame(load_var['column whose percentile is to be found'])
print(a)
b = np.nanpercentile(a, 3)
print(b)
Please suggest the changes in the code.
Thank you.
Upvotes: 0
Views: 31
Reputation: 862731
Use Series.quantile with mean in Series.agg:
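import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col': [7, 8, 9, 4, 2, 3, np.nan],
})

# quantile takes a fraction in [0, 1], so the 3rd percentile is 0.03
f = lambda x: x.quantile(0.03)
f.__name__ = 'q'  # label used for this row in the result

s = df['col'].agg(['mean', f])
print(s)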
mean 5.50
q 2.15
Name: col, dtype: float64
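If only the rows mentioned in the question (13240:61156) should be used, one option is to slice the column by position before aggregating. The following is a minimal sketch on made-up data, not the asker's file: the array `values`, the small row range `5:15`, and the column name `col` are all hypothetical stand-ins. It also shows that `np.nanpercentile` (scale 0 to 100) and `Series.quantile` (scale 0 to 1) should agree.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel column: NaN everywhere except one block.
values = np.full(20, np.nan)
values[5:15] = np.arange(1.0, 11.0)  # rows 5..14 hold 1.0 .. 10.0
col = pd.Series(values, name="col")

# Restrict to the rows of interest by position (end is exclusive).
# For the data in the question this would be .iloc[13240:61156].
subset = col.iloc[5:15]

p3 = subset.quantile(0.03)  # 3rd percentile, NaN values are skipped
avg = subset.mean()         # mean, NaN values are skipped by default

# NumPy equivalent: percentile is given on a 0-100 scale.
p3_np = np.nanpercentile(subset.to_numpy(), 3)

print(p3, avg, p3_np)
```

Both `quantile` and `mean` skip NaN by default, so the slice mainly matters if rows outside that range contain values that should be excluded.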
Upvotes: 1