ggupta

Reputation: 717

Customize the x-axis values of a histogram in Python with pandas.plot

I want to set custom values on the x-axis of a histogram.

I have a DataFrame with a column A whose values range from 0 to 500.

I want to plot the distribution using custom ranges: 0-20, 20-40, 40-60, 60-80, 80-100, and 100-500.

My code looks like this:

df['A'].plot(kind='hist', range=[0,500])

This produces equal-width bins, which is not what I'm looking for.

Upvotes: 1

Views: 3426

Answers (1)

abhilb

Reputation: 5757

You can use np.select to assign each value to one of the required groups. The conditions are checked in order, and the first one that matches determines the group number.

>>> import numpy as np
>>> data = np.random.randint(0,500, size=15)
>>> data
array([ 44, 271, 293, 158, 479, 303,  32,  79, 314, 240,  95, 412, 150,
       356, 376])
>>> np.select([data <= 20, data <= 40, data <= 60, data <= 80, data <= 100, data <= 500], [1,2,3,4,5,6], data)
array([3, 6, 6, 6, 6, 6, 2, 4, 6, 6, 5, 6, 6, 6, 6])
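
To make the first-match behaviour easier to see, here is a minimal sketch with a fixed array instead of random data. The sample values and the default=0 fallback are illustrative assumptions, chosen so that the boundary cases (20, 100) are visible.

```python
import numpy as np

# Fixed sample values so the result is reproducible (illustrative data).
data = np.array([0, 20, 21, 44, 79, 95, 100, 101, 499])

# Upper edges of the requested ranges. np.select checks the conditions
# in order, and the first one that is True decides the group number.
conditions = [data <= 20, data <= 40, data <= 60,
              data <= 80, data <= 100, data <= 500]
group_numbers = [1, 2, 3, 4, 5, 6]

# default=0 marks anything above 500; none of the sample values hit it.
groups = np.select(conditions, group_numbers, default=0)
print(groups.tolist())  # [1, 1, 2, 3, 4, 5, 5, 6, 6]
```

Note that a value sitting exactly on an edge, such as 20 or 100, lands in the lower range because the comparisons use <=.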

So add a new column to your DataFrame like this:

>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randint(0,500,size=1000), columns = list("A"))
>>> df.head(4)
     A
0  179
1  136
2  114
3  124
>>> df["groups"] = np.select([df.A <= 20, df.A <= 40, df.A <= 60, df.A <= 80, df.A <= 100, df.A <= 500], [1,2,3,4,5,6], df.A)
>>> df.head(4)
     A  groups
0  179       6
1  136       6
2  114       6
3  124       6
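
As an alternative sketch for this labelling step, pd.cut can assign the range names directly from a list of bin edges. The names edges and names, the column name "range", and the seeded random data are illustrative assumptions, not part of the approach above.

```python
import numpy as np
import pandas as pd

# Bin edges and labels taken from the ranges in the question.
edges = [0, 20, 40, 60, 80, 100, 500]
names = ["0-20", "20-40", "40-60", "60-80", "80-100", "100-500"]

# Illustrative random data, seeded so the run is repeatable.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.integers(0, 500, size=1000)})

# Bins are right-closed by default, matching the "<=" comparisons;
# include_lowest=True keeps the value 0 in the first bin.
df["range"] = pd.cut(df["A"], bins=edges, labels=names, include_lowest=True)

# sort=False keeps the counts in bin order instead of sorting by frequency.
counts = df["range"].value_counts(sort=False)
print(counts)
```

Because the labels are attached to the bins themselves, the counts cannot drift out of step with their names.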

Then you can plot the histogram like this.

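>>> import matplotlib.pyplot as plt
>>> # order the counts by group number so they line up with the labels
>>> counts = df.groups.value_counts().reindex(range(1, 7), fill_value=0)
>>> df1 = pd.DataFrame({'count' : counts.to_numpy(), 'names' : ["0-20", "20-40", "40-60", "60-80", "80-100", "100-500"]})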
>>> df1.plot.bar(x='names', y='count')
<matplotlib.axes._subplots.AxesSubplot object at 0x0000000018CD2808>
>>> plt.show()
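
If only the bar chart is needed, the counts can also be computed straight from the bin edges with np.histogram, which accepts unequal bins. This is a rough sketch under a few assumptions: the sample values are made up, plot_range_counts is a hypothetical helper name, and matplotlib is imported only inside that helper. Unlike the <= comparisons above, np.histogram treats every bin except the last as half-open, so a value of exactly 20 is counted in 20-40.

```python
import numpy as np

# Unequal bin edges from the question.
edges = [0, 20, 40, 60, 80, 100, 500]

# Illustrative values, including some that sit exactly on an edge.
values = np.array([0, 5, 20, 33, 47, 60, 79, 95, 100, 250, 499, 500])

# Each bin is [low, high) except the last one, which is [100, 500].
counts, _ = np.histogram(values, bins=edges)

# Build "low-high" tick labels from consecutive edges.
labels = [f"{lo}-{hi}" for lo, hi in zip(edges[:-1], edges[1:])]
print(dict(zip(labels, counts.tolist())))


def plot_range_counts(labels, counts):
    """Draw equal-width bars, one per range (hypothetical helper)."""
    import matplotlib.pyplot as plt

    plt.bar(labels, counts)
    plt.xlabel("range of A")
    plt.ylabel("count")
    plt.show()
```

Calling plot_range_counts(labels, counts) draws one bar per range with the range names on the x-axis.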

Upvotes: 2
