Eran

Reputation: 844

Plotting two date histograms with pandas

I have two datetime series, which I'm trying to plot side by side with a shared X-axis.

import pandas as pd
import matplotlib.pyplot as plt

dates1 = ['2015-02-02', '2016-06-29', '2016-06-01', '2015-07-19', '2016-08-17', '2016-11-22',
'2016-07-24', '2016-10-30', '2015-02-01', '2017-01-29', '2015-03-19', '2016-09-06',
'2016-11-23', '2016-06-21', '2016-10-05', '2016-02-23', '2016-11-24', '2016-10-05',
'2015-07-16', '2016-06-07', '2016-07-31', '2016-11-01', '2016-11-02', '2016-08-16',
'2015-06-09', '2016-04-11', '2017-02-09', '2015-05-20', '2016-05-17', '2016-09-12',
'2015-08-05', '2017-02-19']

dates2 = ['2016-03-22', '2016-03-16', '2015-07-02', '2016-09-13', '2014-09-04', '2016-07-12',
'2016-05-08', '2016-02-18', '2014-07-10', '2016-05-10', '2016-05-02', '2016-11-20',
'2015-05-19', '2016-01-06', '2016-06-21', '2015-03-25', '2016-06-09', '2016-12-07',
'2016-10-18', '2016-03-27', '2017-03-19', '2016-10-27', '2017-01-12', '2015-12-31',
'2016-05-05', '2016-07-17', '2016-07-10', '2017-06-14', '2015-12-27', '2016-03-01',
'2016-05-04', '2017-05-15']

# Parse the strings to datetime64[ns]; recent pandas rejects a unit-less np.datetime64 dtype
ser1 = pd.Series(pd.to_datetime(dates1))
ser2 = pd.Series(pd.to_datetime(dates2))

fig, axes = plt.subplots(1, 2, figsize=(20, 10), sharex=True)
ser1.groupby([ser1.dt.year, ser1.dt.month]).count().plot(kind='bar', ax=axes[0])
ser2.groupby([ser2.dt.year, ser2.dt.month]).count().plot(kind='bar', ax=axes[1])
plt.show()

[Image: the two bar plots with sharex=True]

As seen in the image, ser1 appears to have values at (2014, 7), but its earliest actual value is 2015-02-01. For reference, here are the two plots with sharex=False:

fig, axes = plt.subplots(1, 2, figsize=(20, 10), sharex=False)
ser1.groupby([ser1.dt.year, ser1.dt.month]).count().plot(kind='bar', ax=axes[0])
ser2.groupby([ser2.dt.year, ser2.dt.month]).count().plot(kind='bar', ax=axes[1])
plt.show()

[Image: the two bar plots with sharex=False]

Is there a simple way to solve this without manually limiting the X-axis?

Upvotes: 0

Views: 31

Answers (1)

Shovalt

Reputation: 6776

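A bar plot draws its bars at integer positions 0..n-1 and uses the (year, month) values only as tick labels, so with sharex=True both subplots end up showing the labels of the last plot drawn. To fix this, concatenate the groupby counts of ser1 and ser2 column-wise. This aligns them on the union of their (year, month) indices, so a month that is present in only one series shows up as NaN in the other. Then fill the NaNs with zeros and plot as before: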

sgp1 = ser1.groupby([ser1.dt.year, ser1.dt.month]).count()
sgp2 = ser2.groupby([ser2.dt.year, ser2.dt.month]).count()

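# Align both counts on the union of (year, month) keys; sort so the bars stay in chronological order
df = pd.concat([sgp1, sgp2], axis=1).fillna(0).sort_index()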

fig, axes = plt.subplots(1, 2, figsize=(20, 10), sharex=True)

df[0].plot(kind='bar', ax=axes[0])
df[1].plot(kind='bar', ax=axes[1])
plt.show()

Result: [image of the two plots with a common set of X-axis labels]
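One limitation of the concat approach is that a month with no dates in either series still gets no bar, so gaps in the timeline are not visible. A possible variation, offered only as a sketch and not part of the original answer, is to count per monthly period and reindex both counts onto the full month range. The function names monthly_counts and plot_side_by_side and the column names "ser1" and "ser2" are made up for this illustration.

```python
import pandas as pd


def monthly_counts(dates_a, dates_b):
    """Count dates per month for two date lists on one shared monthly index."""
    # Convert each date to its monthly period, e.g. '2015-02-02' -> 2015-02.
    per_a = pd.Series(pd.to_datetime(dates_a)).dt.to_period("M")
    per_b = pd.Series(pd.to_datetime(dates_b)).dt.to_period("M")

    # Every month from the earliest to the latest date found in either list.
    full_range = pd.period_range(
        min(per_a.min(), per_b.min()),
        max(per_a.max(), per_b.max()),
        freq="M",
    )

    # Count occurrences per month, then reindex so absent months become 0.
    return pd.DataFrame({
        "ser1": per_a.value_counts().reindex(full_range, fill_value=0),
        "ser2": per_b.value_counts().reindex(full_range, fill_value=0),
    })


def plot_side_by_side(counts):
    """Draw both count columns as bar plots that share one X-axis."""
    # Imported here so the counting step can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(20, 10), sharex=True)
    counts["ser1"].plot(kind="bar", ax=axes[0])
    counts["ser2"].plot(kind="bar", ax=axes[1])
    plt.show()
```

With the data from the question, this would be called as plot_side_by_side(monthly_counts(dates1, dates2)). Because both columns have identical indices, the shared tick labels are correct for both subplots.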

Upvotes: 2
