GeoCom

Reputation: 1374

How to generate a bar chart of occurrences per year in matplotlib python?

I have a list of dates and I want to generate a bar chart from it with matplotlib in Python.

2007-05-06
2007-05-11
2007-06-01
2007-06-04
2007-06-06
2007-09-01
2007-10-06
2007-11-06
2007-11-07
…

I want to produce these two types of bar chart:

[Image: the two desired bar charts]

I can use the code below, but I'm looking for something more efficient: as you can see, my years run from 2007 to 2012, and sometimes this range can be wider.

import numpy as np
import matplotlib.pyplot as plt

def plot():
    #--- the three samples ---
    samples1 = [1, 1, 1, 3, 2, 5, 1, 10, 10, 8]
    samples2 = [6, 6, 6, 1, 2, 3, 9, 12 ] 
    samples3 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 11, 12]

    N = 12 # number of bins
    hist1 = np.array([0] * N )
    hist2 = np.array([0] * N )
    hist3 = np.array([0] * N )

    #--- create the three histograms. Values of 1 go in Bin 0 ---
    for x in samples1:
        hist1[x-1] += 1
    for x in samples2:
        hist2[x-1] += 1
    for x in samples3:
        hist3[x-1] += 1

    #--- display the bar-graph ---        
    width = 1
    p1 = plt.bar( np.arange(0,N)+0.5, hist1, width, color='#9932cc' )
    p2 = plt.bar( np.arange(0,N)+0.5, hist2, width, color='#ffa500', bottom=hist1 )
    p3 = plt.bar( np.arange(0,N)+0.5, hist3, width, color='#d2691e', bottom=hist1+hist2 )
    plt.legend( (p1[0], p2[0], p3[0]), ( 'hist1', 'hist2', 'hist3' ) )
    plt.xlabel( 'Bins' )
    plt.ylabel( 'Count' )
    #plt.axis([1, 46, 0, 6])
    plt.xticks( np.arange( 1,N+1 ) )
    plt.axis( [width/2.0, N+width/2.0, 0, max( hist1+hist2+hist3)] )
    plt.show()

Can you help me generate this kind of chart?

Thank you

Upvotes: 2

Views: 2220

Answers (2)

xnx

Reputation: 25478

The two plots are generated in a very similar way, so I'll do the first one only. You need to loop over the months and, to get a stacked bar chart, set the bottom of each month's bar to the cumulative sum of the previous months' values for each year:

import numpy as np
import matplotlib.pyplot as plt

months = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')

# Some random data for nyears from minyear
nyears = 8
nmonths = len(months)
minyear = 2005
monthly_counts = np.random.randint(low=2, high=15, size=(nyears,nmonths))

fig, ax = plt.subplots()
ind = np.arange(nyears)
width = 0.45
# Random colors for the months: one RGB triple per month
c = np.random.rand(nmonths, 3)

p = []
for imonth in range(nmonths):
    # Each month's bar starts where the previous months' bars end
    p.append(ax.bar(ind, monthly_counts[:,imonth], width,
                    bottom=np.sum(monthly_counts[:,:imonth], axis=1),
                    color=c[imonth], alpha=0.8, align='center')
            )

# Set x axis ticks (at the bar centers) and labels
ax.set_xticks(ind)
ax.set_xticklabels([str(minyear+i) for i in ind])

# Locate legend outside axes plot area
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.8, box.height])
ax.legend([pl[0] for pl in p], months, loc='center left', bbox_to_anchor=(1, 0.5))

plt.show()
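
To fill monthly_counts from real dates instead of random numbers, something along these lines might work. It is only a sketch: the helper name count_by_year_month is made up for illustration, and it assumes a non-empty list of 'YYYY-MM-DD' strings like the ones in the question.

```python
from datetime import datetime

import numpy as np


def count_by_year_month(date_strings):
    """Return (minyear, counts), where counts[i, j] is the number of dates
    falling in year minyear + i and month j + 1.

    Hypothetical helper; assumes a non-empty list of 'YYYY-MM-DD' strings.
    """
    dates = [datetime.strptime(s, '%Y-%m-%d') for s in date_strings]
    years = np.array([d.year for d in dates])
    months = np.array([d.month for d in dates])
    minyear = int(years.min())
    nyears = int(years.max()) - minyear + 1
    counts = np.zeros((nyears, 12), dtype=int)
    # np.add.at accumulates correctly even when a (year, month) pair repeats
    np.add.at(counts, (years - minyear, months - 1), 1)
    return minyear, counts


# The dates listed in the question
dates = ['2007-05-06', '2007-05-11', '2007-06-01', '2007-06-04',
         '2007-06-06', '2007-09-01', '2007-10-06', '2007-11-06',
         '2007-11-07']
minyear, monthly_counts = count_by_year_month(dates)
```

Since the year range is derived from the data, a wider range of years only adds rows to monthly_counts; nyears would then be monthly_counts.shape[0].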


Upvotes: 2

Ed Smith

Reputation: 13206

You can use NumPy's histogram function to get the counts for the bars directly, which should be faster than looping in Python. A minimal example based on your data above:

import numpy as np
import matplotlib.pyplot as plt

#--- the three samples ---
samples1 = [1, 1, 1, 3, 2, 5, 1, 10, 10, 8]
samples2 = [6, 6, 6, 1, 2, 3, 9, 12 ] 
samples3 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 11, 12]

N = 12 # number of bins, one per month

#--- create the three histograms. Bin edges 1, 2, ..., 13 put month m in bin m-1 ---
edges = np.arange(1, N+2)
hist1, n = np.histogram(samples1, bins=edges)
hist2, n = np.histogram(samples2, bins=edges)
hist3, n = np.histogram(samples3, bins=edges)

#--- display the bar-graph ---
import datetime

width = 1
x = np.arange(1, N+1)  # bar centers at 1..12, one per month
p1 = plt.bar( x, hist1, width, color='#9932cc', align='center' )
p2 = plt.bar( x, hist2, width, color='#ffa500', bottom=hist1, align='center' )
p3 = plt.bar( x, hist3, width, color='#d2691e', bottom=hist1+hist2, align='center' )
plt.legend( (p1[0], p2[0], p3[0]), ( '2010', '2011', '2012' ) )
plt.xlabel( 'Month' )
plt.ylabel( 'Count' )
# Three-letter month names for the tick labels
months = [datetime.date(2010, i, 1).strftime('%b') for i in range(1,13)]
plt.xticks( x, months )
plt.axis( [width/2.0, N+width/2.0, 0, max( hist1+hist2+hist3)] )
plt.show()
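
The same idea should extend to any number of years without a separate histN variable for each one. A rough sketch, assuming the month numbers have already been grouped by year in a dict (the name months_by_year and the year keys are invented for illustration; the values are the sample lists above):

```python
import numpy as np

N = 12
edges = np.arange(1, N + 2)  # 1, 2, ..., 13: one bin per month

# Hypothetical input: month number of each date, grouped by year
months_by_year = {
    2010: [1, 1, 1, 3, 2, 5, 1, 10, 10, 8],
    2011: [6, 6, 6, 1, 2, 3, 9, 12],
    2012: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 11, 12],
}

years = sorted(months_by_year)
# One row of monthly counts per year
hists = np.array([np.histogram(months_by_year[y], bins=edges)[0]
                  for y in years])

# bottoms[i] is the height at which year i's bars start:
# the summed counts of all earlier years
bottoms = np.cumsum(hists, axis=0) - hists
```

Each row can then be drawn in a loop, for example with plt.bar(x, hists[i], width, bottom=bottoms[i], label=str(years[i])), so a wider range of years needs no extra code.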

Upvotes: 3
