JamesArthur

Reputation: 506

Problem creating a time series stacked bar graph depicting number of times an event happened in a month

I have the df shown below, and I'm trying to visualize how many of each 'Disaster_Type' happened each month, to see whether the start date of winter storms has shifted. I thought the best way would be a stacked bar graph, but I'm open to suggestions.

The y axis should therefore be the disaster count and the x axis time, with each bar representing a month of the year (only months with a weather event).

Each bar would then use a different color for each type of disaster, showing the count of each.

I created the df from a larger set, and split the datetime format into separate columns.

I know I have to use groupby and count to aggregate the data further, but I'm not quite sure how to go about it.

#Current attempt

df_winter_start.groupby('month').count

full code:

 
import numpy as np
import matplotlib.pyplot as plt 
from matplotlib.colors import ListedColormap
import pandas as pd 
import seaborn as sns 


df_time = pd.read_pickle('df_time.pkl')


df_winter = df_time[(df_time['Disaster_Type'] == 'Winter') | (df_time['Disaster_Type'] == 'Snow') | (df_time['Disaster_Type'] == 'Ice')]

df_winter.drop_duplicates(keep='first')

df_winter = df_winter.reset_index(drop=True, inplace=False)


df_winter['day'] = df_winter['Start_Date_A'].dt.day
df_winter['month'] = df_winter['Start_Date_A'].dt.month
df_winter['year'] = df_winter['Start_Date_A'].dt.year


df_winter_start = df_winter.drop(columns=['County', 'Start_Date_A','End_Date_A', 'Start_year', 'Disaster_Length',])


fig, ax = plt.subplots(figsize=(12,8))


colors = [
    '#1f77b4', '#aec7e8', '#ff7f0e', '#ffbb78', '#2ca02c', '#98df8a',
    '#d62728', '#ff9896', '#9467bd', '#c5b0d5', '#8c564b', '#c49c94',
    '#e377c2', '#f7b6d2', '#7f7f7f']

cmap = ListedColormap(colors, name="custom")

df_winter_start.pivot(columns= 'Disaster_Type', values='month' ).plot.bar(stacked=True, ax=ax, zorder=3,cmap=cmap)
ax.legend(ncol=3, edgecolor='w')
[ax.spines[s].set_visible(False) for s in ['top','right', 'left']]
ax.tick_params(axis='both', left=False, bottom=False)
plt.title(label='Monthly Winter Storm Counts from 1965-2017', size=20)


ax.grid(axis='y', dashes=(8,3), color='grey', alpha=0.3)

df_winter_start

   Disaster_Type  day  month  year
0            Ice   10      2  1968
1            Ice   10      2  1968
2            Ice   10      2  1968
3            Ice   10      2  1968
4            Ice   10      2  1968
5            Ice   10      2  1968
6            Ice   10      2  1968
7            Ice   10      2  1968
8            Ice   10      2  1968
9            Ice   10      2  1968
10           Ice   10      2  1968
11           Ice   10      2  1968
12           Ice   10      2  1968
13           Ice   10      2  1968
14           Ice   10      2  1968
15           Ice   10      2  1968
16           Ice   10      2  1968
17           Ice   10      2  1968
18           Ice   10      2  1968
19           Ice   10      2  1968
20           Ice   10      2  1968
21        Winter   15      3  1971
22        Winter   15      3  1971
23        Winter   15      3  1971
24        Winter   15      3  1971
25        Winter   15      3  1971
26        Winter   15      3  1971
27          Snow    5      4  1972
28          Snow    5      4  1972
29          Snow    5      4  1972
30          Snow    5      4  1972
31          Snow    5      4  1972
32          Snow    5      4  1972
33          Snow    5      4  1972
34          Snow    5      4  1972
35          Snow    5      4  1972
36        Winter   24      6  1974
37        Winter   24      6  1974
38           Ice   19      3  1976
39           Ice   19      3  1976
40           Ice   19      3  1976
41           Ice   19      3  1976
42           Ice   19      3  1976
43           Ice   19      3  1976
44           Ice   19      3  1976
45           Ice    8      4  1976
46           Ice    8      4  1976
47           Ice    8      4  1976
48           Ice    8      4  1976
49           Ice    8      4  1976

Upvotes: 3

Views: 96

Answers (2)

Arne

Reputation: 10545

Or you could group df_winter_start by type and month, counting up the events, and then use seaborn for plotting:

df_counts = (
    df_winter_start[['Disaster_Type', 'month']]
    .value_counts()
    .reset_index(name='count')
)
df_counts

  Disaster_Type  month  count
0           Ice      2     21
1          Snow      4      9
2           Ice      3      7
3        Winter      3      6
4           Ice      4      5
5        Winter      6      2

import seaborn as sns

cmap = {'Ice': '#1f77b4', 'Snow': '#aec7e8', 'Winter': '#ff7f0e'}

sns.barplot(data=df_counts, x='month', y='count', 
            hue='Disaster_Type', palette=cmap);

winter plot
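
For reference, the same counts can also be built with groupby, which is what the question was reaching for. This is only a sketch: the small df_winter_start below is a stand-in reconstructed from the counts shown above, not the real data. Note that .count without parentheses (as in the question's attempt) only references the method and does not call it.

```python
import pandas as pd

# Stand-in for df_winter_start, rebuilt from the counts shown above
# (an assumption for illustration; the real frame has more columns).
df_winter_start = pd.DataFrame({
    "Disaster_Type": ["Ice"] * 21 + ["Winter"] * 6 + ["Snow"] * 9
                     + ["Winter"] * 2 + ["Ice"] * 7 + ["Ice"] * 5,
    "month": [2] * 21 + [3] * 6 + [4] * 9 + [6] * 2 + [3] * 7 + [4] * 5,
})

# size() counts the rows in each (type, month) group;
# reset_index turns the grouped Series back into a flat DataFrame.
df_counts = (
    df_winter_start.groupby(["Disaster_Type", "month"])
    .size()
    .reset_index(name="count")
)
print(df_counts)
```

Unlike value_counts, which sorts by count, groupby orders the rows by the group keys, so the row order differs from the table above while the counts are the same.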

Upvotes: 1

Henry Ecker

Reputation: 35636

Very close. The reshaping operation should be pivot_table with aggfunc set to 'count', rather than pivot:

plot_df = (
    df_winter_start.pivot_table(index='month', columns='Disaster_Type',
                                values='day', aggfunc='count')
)

Now that the data is in the correct format:

plot_df

Disaster_Type   Ice  Snow  Winter
month                            
2              21.0   NaN     NaN
3               7.0   NaN     6.0
4               5.0   9.0     NaN
6               NaN   NaN     2.0
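
The NaN cells in the table above are month/type combinations with no events, and they turn the counts into floats. If integer counts with explicit zeros are preferred, one possible alternative is pd.crosstab (passing fill_value=0 to pivot_table is another option). A minimal sketch, again using a stand-in frame rebuilt from the counts shown here rather than the real data:

```python
import pandas as pd

# Stand-in for df_winter_start (an assumption for illustration).
df_winter_start = pd.DataFrame({
    "Disaster_Type": ["Ice"] * 21 + ["Winter"] * 6 + ["Snow"] * 9
                     + ["Winter"] * 2 + ["Ice"] * 7 + ["Ice"] * 5,
    "month": [2] * 21 + [3] * 6 + [4] * 9 + [6] * 2 + [3] * 7 + [4] * 5,
})

# crosstab counts co-occurrences of month (rows) and type (columns),
# filling combinations that never occur with 0 instead of NaN.
plot_df = pd.crosstab(df_winter_start["month"], df_winter_start["Disaster_Type"])
print(plot_df)
```

The result has the same shape as the pivot_table output, so it can be passed to plot.bar in the same way.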

plot_df can then be plotted as stacked bars, reusing the ax and cmap defined in the question's code:

plot_df.plot.bar(stacked=True, ax=ax, zorder=3, cmap=cmap, rot=0)

plot 1 stacked

or as separate bars:

plot_df.plot.bar(ax=ax, zorder=3, cmap=cmap, rot=0)

plot 2 separate bars
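
Since the plotting calls above depend on ax and cmap from the question, here is a self-contained sketch of the stacked version. The plot_df values are typed in from the table above with 0 in place of NaN, the three-color map is a shortened version of the question's list, and the axis labels are my own additions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.colors import ListedColormap

# Counts per month and type, copied from the pivot table (NaN replaced by 0).
plot_df = pd.DataFrame(
    {"Ice": [21, 7, 5, 0], "Snow": [0, 0, 9, 0], "Winter": [0, 6, 0, 2]},
    index=pd.Index([2, 3, 4, 6], name="month"),
)
plot_df.columns.name = "Disaster_Type"

# One color per disaster type (first three colors from the question's list).
cmap = ListedColormap(["#1f77b4", "#aec7e8", "#ff7f0e"], name="custom")

fig, ax = plt.subplots(figsize=(12, 8))
# Each column becomes one colored segment of the bar for its month.
plot_df.plot.bar(stacked=True, ax=ax, zorder=3, cmap=cmap, rot=0)
ax.set_xlabel("Month")
ax.set_ylabel("Disaster count")
ax.set_title("Monthly Winter Storm Counts", size=20)
ax.grid(axis="y", dashes=(8, 3), color="grey", alpha=0.3)
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to view or save the figure. Dropping stacked=True gives the side-by-side version.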

Upvotes: 2
