Reputation: 4295
Is there any library to plot a histogram by percentiles based on a series? I have been digging around pandas but I do not see any available method for this. I do know of a long workaround, which is to manually calculate the number of occurrences for each percentile I want, but I think there is probably a better solution.
Currently, this is what I have to get the individual counts:
# Sample series
tenth = df.col.quantile(0.1)
twenty = df.col.quantile(0.2)
# Count the values that fall between the 10th and 20th percentiles
twenty_count = ((df.col > tenth) & (df.col <= twenty)).sum()
And so on...
However, using describe, I managed to get this:
df.describe(percentiles=[x / 10.0 for x in range(1, 11)])
Upvotes: 1
Views: 6790
Reputation: 294318
IIUC
df.col.rank(pct=True).hist()
However, this is a bad idea.
Consider the following dataframe df:
import numpy as np
import pandas as pd

df = pd.DataFrame(dict(
    col=np.random.randn(1000),
    col2=np.random.rand(1000)
))
Then
df.col.rank(pct=True).hist()
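Which is a silly graph: percentile ranks are spread evenly between 0 and 1 by construction, so every bar ends up at roughly the same height regardless of how the data is distributed.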
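To see why the ranked histogram carries no information, here is a minimal sketch that counts the ranked values per bin without plotting. The fixed seed and the variable names are assumptions made for illustration, and `np.histogram` stands in for the plotting call using the same 10 bins that `hist()` uses by default.

```python
import numpy as np
import pandas as pd

# Fixed seed is an assumption, used only to make the sketch reproducible.
rng = np.random.default_rng(0)
s = pd.Series(rng.standard_normal(1000))

# Percentile rank of each value: a reshuffling of 1/1000, 2/1000, ..., 1.0.
pct = s.rank(pct=True)

# Count the ranks per bin, using 10 equal-width bins.
counts, edges = np.histogram(pct, bins=10)
print(counts)
```

The counts come out essentially equal across bins, which is what makes the plot flat.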
Instead, divide by the maximum absolute value, which rescales each column to the range [-1, 1] while keeping the shape of its distribution:
(df / df.abs().max()).hist()
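If what you actually need is the per-percentile counts from the question rather than a rescaled plot, one possible sketch is to use the deciles themselves as bin edges. This is a different technique from the scaling above, not part of the original answer. The seed and the names `edges`, `counts`, and `widths` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Fixed seed is an assumption, used only to make the sketch reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame({"col": rng.standard_normal(1000)})

# Bin edges at the 0th, 10th, ..., 100th percentiles of the column.
edges = df["col"].quantile(np.linspace(0, 1, 11)).to_numpy()

# Number of observations between consecutive percentile edges.
counts, _ = np.histogram(df["col"], bins=edges)

# Bin widths: narrow bins mark dense regions, wide bins mark sparse ones.
widths = np.diff(edges)
print(counts)
print(widths.round(3))
```

Passing the same array as `df.col.hist(bins=edges)` should draw these variable-width bins. Because each bin holds about the same number of points by construction, the widths carry the information, not the heights.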
Upvotes: 5