Reputation: 506
I have the df shown below, and I'm trying to visualize how many of each 'Disaster_Type' happened each month, to see whether the start date of winter storms has shifted/changed. I thought the best way would be a stacked bar graph, but I'm open to suggestions.
The y-axis should therefore be the disaster count and the x-axis time, with each bar representing a month of the year (only months with a weather event).
Each bar would then use a different color for each type of disaster to show the count of each.
I created the df from a larger set, and split the datetime format into separate columns.
I know I have to use groupby and count to aggregate the data further, but I'm not quite sure how to go about it.
#Current attempt
df_winter_start.groupby('month').count()
Full code:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
import pandas as pd
import seaborn as sns
df_time = pd.read_pickle('df_time.pkl')
df_winter = df_time[df_time['Disaster_Type'].isin(['Winter', 'Snow', 'Ice'])]
df_winter.drop_duplicates(keep='first')
df_winter = df_winter.reset_index(drop=True, inplace=False)
df_winter['day'] = df_winter['Start_Date_A'].dt.day
df_winter['month'] = df_winter['Start_Date_A'].dt.month
df_winter['year'] = df_winter['Start_Date_A'].dt.year
df_winter_start = df_winter.drop(columns=['County', 'Start_Date_A','End_Date_A', 'Start_year', 'Disaster_Length',])
fig, ax = plt.subplots(figsize=(12,8))
colors = [
'#1f77b4', '#aec7e8', '#ff7f0e', '#ffbb78', '#2ca02c', '#98df8a',
'#d62728', '#ff9896', '#9467bd', '#c5b0d5', '#8c564b', '#c49c94',
'#e377c2', '#f7b6d2', '#7f7f7f']
cmap = ListedColormap(colors, name="custom")
df_winter_start.pivot(columns= 'Disaster_Type', values='month' ).plot.bar(stacked=True, ax=ax, zorder=3,cmap=cmap)
ax.legend(ncol=3, edgecolor='w')
[ax.spines[s].set_visible(False) for s in ['top','right', 'left']]
ax.tick_params(axis='both', left=False, bottom=False)
plt.title(label='Monthly Winter Storm Counts from 1965-2017', size=20)
ax.grid(axis='y', dashes=(8,3), color='grey', alpha=0.3)
df_winter_start
Disaster_Type day month year
0 Ice 10 2 1968
1 Ice 10 2 1968
2 Ice 10 2 1968
3 Ice 10 2 1968
4 Ice 10 2 1968
5 Ice 10 2 1968
6 Ice 10 2 1968
7 Ice 10 2 1968
8 Ice 10 2 1968
9 Ice 10 2 1968
10 Ice 10 2 1968
11 Ice 10 2 1968
12 Ice 10 2 1968
13 Ice 10 2 1968
14 Ice 10 2 1968
15 Ice 10 2 1968
16 Ice 10 2 1968
17 Ice 10 2 1968
18 Ice 10 2 1968
19 Ice 10 2 1968
20 Ice 10 2 1968
21 Winter 15 3 1971
22 Winter 15 3 1971
23 Winter 15 3 1971
24 Winter 15 3 1971
25 Winter 15 3 1971
26 Winter 15 3 1971
27 Snow 5 4 1972
28 Snow 5 4 1972
29 Snow 5 4 1972
30 Snow 5 4 1972
31 Snow 5 4 1972
32 Snow 5 4 1972
33 Snow 5 4 1972
34 Snow 5 4 1972
35 Snow 5 4 1972
36 Winter 24 6 1974
37 Winter 24 6 1974
38 Ice 19 3 1976
39 Ice 19 3 1976
40 Ice 19 3 1976
41 Ice 19 3 1976
42 Ice 19 3 1976
43 Ice 19 3 1976
44 Ice 19 3 1976
45 Ice 8 4 1976
46 Ice 8 4 1976
47 Ice 8 4 1976
48 Ice 8 4 1976
49 Ice 8 4 1976
Upvotes: 3
Views: 96
Reputation: 10545
Or you could group df_winter_start by type and month, counting up the events, and then use seaborn for plotting:
df_counts = df_winter_start[['Disaster_Type', 'month']].value_counts().reset_index(name='count')
df_counts
Disaster_Type month count
0 Ice 2 21
1 Snow 4 9
2 Ice 3 7
3 Winter 3 6
4 Ice 4 5
5 Winter 6 2
import seaborn as sns
cmap = {'Ice': '#1f77b4', 'Snow': '#aec7e8', 'Winter': '#ff7f0e'}
sns.barplot(data=df_counts, x='month', y='count',
            hue='Disaster_Type', palette=cmap);
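Note that sns.barplot with hue draws the types side by side rather than stacked. If stacked bars are still wanted, one option is to pivot the long df_counts table to wide form and hand that to pandas plotting. A minimal sketch, assuming the sample rows from the question (the events list is a hand-made reconstruction of that printout, and wide is just an illustrative name):

```python
import pandas as pd

# Assumption: reconstruction of the 50 sample rows printed in the question,
# written as (Disaster_Type, day, month, year, number of repeated rows).
events = [
    ("Ice", 10, 2, 1968, 21),
    ("Winter", 15, 3, 1971, 6),
    ("Snow", 5, 4, 1972, 9),
    ("Winter", 24, 6, 1974, 2),
    ("Ice", 19, 3, 1976, 7),
    ("Ice", 8, 4, 1976, 5),
]
rows = [(t, d, m, y) for t, d, m, y, n in events for _ in range(n)]
df_winter_start = pd.DataFrame(rows, columns=["Disaster_Type", "day", "month", "year"])

# Long-form counts, one row per (type, month) pair, as in the answer.
df_counts = (
    df_winter_start[["Disaster_Type", "month"]]
    .value_counts()
    .reset_index(name="count")
)

# Wide form: one row per month, one column per type, 0 where nothing occurred.
wide = (
    df_counts.pivot(index="month", columns="Disaster_Type", values="count")
    .fillna(0)
    .astype(int)
)
print(wide)
```

Calling wide.plot.bar(stacked=True, rot=0) on the result should then draw one stacked bar per month.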
Upvotes: 1
Reputation: 35636
Very close. The reshaping operation should be a pivot_table with the aggfunc set to count rather than pivot:
plot_df = (
    df_winter_start.pivot_table(index='month', columns='Disaster_Type',
                                values='day', aggfunc='count')
)
Now that the data is in the correct format:
plot_df
Disaster_Type Ice Snow Winter
month
2 21.0 NaN NaN
3 7.0 NaN 6.0
4 5.0 9.0 NaN
6 NaN NaN 2.0
It can be plotted with stacked bars:
plot_df.plot.bar(stacked=True, ax=ax, zorder=3, cmap=cmap, rot=0)
or as separate bars:
plot_df.plot.bar(ax=ax, zorder=3, cmap=cmap, rot=0)
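For completeness, the groupby route the question was reaching for gets to much the same table. A sketch, assuming the sample rows shown in the question (rebuilt by hand in events, which is not part of the original code): size() counts the rows in each group, and unstack(fill_value=0) puts zeros where pivot_table leaves NaN, so the counts stay integers.

```python
import pandas as pd

# Assumption: reconstruction of the 50 sample rows printed in the question,
# written as (Disaster_Type, day, month, year, number of repeated rows).
events = [
    ("Ice", 10, 2, 1968, 21),
    ("Winter", 15, 3, 1971, 6),
    ("Snow", 5, 4, 1972, 9),
    ("Winter", 24, 6, 1974, 2),
    ("Ice", 19, 3, 1976, 7),
    ("Ice", 8, 4, 1976, 5),
]
rows = [(t, d, m, y) for t, d, m, y, n in events for _ in range(n)]
df_winter_start = pd.DataFrame(rows, columns=["Disaster_Type", "day", "month", "year"])

# Count rows per (month, type) pair, then spread the types into columns.
plot_df = (
    df_winter_start.groupby(["month", "Disaster_Type"])
    .size()
    .unstack(fill_value=0)
)
print(plot_df)
```

The result can be passed to the same plot.bar calls as above.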
Upvotes: 2