Reputation: 1
I'm using matplotlib in Python to make a multi-bar graph: like a double-bar graph, but with 7 bars per category. I'm not sure my approach is the best way to plot this data, but it works.
However, some of the values for those 7 bars per group are missing (0 or NaN), which leaves a gap in the graph where the bar would be.
Is it possible to modify the plotting code to skip the bars that have a value of 0 or NaN without disturbing the rest of the spacing?
Here is the data I'm using. I've posted it as a dictionary so it can be copied, but I actually work with it as a DataFrame.
# Libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Dictionary:
bardict = {"B": [0.14,0.12,0.02,0.02,np.nan,np.nan,np.nan],
           "L": [0.08,0.14,0.06,0.06,0.07,0.12,0.08],
           "M": [0.08,0.09,0.07,0.08,0.22,0.15,0.06],
           "C": [0.11,0.10,0.13,0.35,np.nan,0.11,0.21],
           "S": [np.nan,0.11,0.17,0.46,0.09,0.10,0.08],
           "W": [0.12,0.09,0.29,0.63,0.10,0.38,0.26]}
# DataFrame saved under name bar_C:
      B     C     L     M     S     W
0  0.14  0.11  0.08  0.08   NaN  0.12
1  0.12  0.10  0.14  0.09  0.11  0.09
2  0.02  0.13  0.06  0.07  0.17  0.29
3  0.02  0.35  0.06  0.08  0.46  0.63
4   NaN   NaN  0.07  0.22  0.09  0.10
5   NaN  0.11  0.12  0.15  0.10  0.38
6   NaN  0.21  0.08  0.06  0.08  0.26
And finally, here is my code so far for making the bar graph. (Note: I don't want to arrange the bars by value; I want them kept in the order listed: B, C, L, M, ...)
# Convert dictionary to DataFrame
bar_C = pd.DataFrame(bardict)
fig, ax = plt.subplots(figsize=(10, 6))
N = 6
ind = np.arange(N) #*
width = 0.55
num3_vals = bar_C.loc[0]
num3 = plt.bar(ind*N+(width*0), num3_vals, width, color = 'mediumturquoise')
num4_vals = bar_C.loc[1]
num4 = plt.bar(ind*N+(width*1), num4_vals, width, color='darkred')
num5_vals = bar_C.loc[2]
num5 = plt.bar(ind*N+(width*2), num5_vals, width, color='lightgreen')
num6_vals = bar_C.loc[3]
num6 = plt.bar(ind*N+(width*3), num6_vals, width, color='purple')
num7_vals = bar_C.loc[4]
num7 = plt.bar(ind*N+(width*4), num7_vals, width, color='gray')
print(num7_vals)
num8_vals = bar_C.loc[5]
num8 = plt.bar(ind*N+(width*5), num8_vals, width, color='orange')
num9_vals = bar_C.loc[6]
num9 = plt.bar(ind*N+(width*6), num9_vals, width, color='darkblue')
plt.xlabel("Group")
plt.ylabel('Value')
plt.title("Multi-Bar Graph")
ax = plt.gca()
# Label each group with the column it is actually plotted from
plt.xticks(ticks = (ind*N),labels = list(bar_C.columns))
plt.legend((num3, num4, num5, num6, num7, num8, num9),
('3','4','5','6','7','8','9'),
loc='upper left')
plt.show()
Output:
B NaN
L 0.07
M 0.22
C NaN
S 0.09
W 0.10
Name: 4, dtype: float64
Pic:
I'm a brand new user, so firstly thanks in advance, and secondly, please let me know if there are any formatting errors.
I was hoping for every bar that has data (i.e. is not zero or NaN) to be plotted. That works, except that a blank spot is left where each missing bar would be. I'd like to close up those empty spaces without having to delete the data.
I tried using fillna() to replace every NaN with 0, but this didn't change the graph in any way.
Upvotes: 0
Views: 41
Reputation: 3096
My attempt:
# Libraries
import pandas as pd
print('pandas : ', pd.__version__)
import matplotlib.pyplot as plt
import numpy as np
# Dictionary:
bardict = {"B": [0.14,0.12,0.02,0.02,np.nan,np.nan,np.nan],
"L": [0.08,0.14,0.06,0.06,0.07,0.12,0.08],
"M": [0.08,0.09,0.07,0.08,0.22,0.15,0.06],
"C": [0.11,0.10,0.13,0.35,np.nan,0.11,0.21],
"S": [np.nan,0.11,0.17,0.46,0.09,0.10,0.08],
"W": [0.12,0.09,0.29,0.63,0.10,0.38,0.26]}
# Convert dictionary to DataFrame
bar_C = pd.DataFrame(bardict)
print(bar_C)
bar_C.fillna(-2, inplace=True)
print(bar_C)
fig, ax = plt.subplots(figsize=(10, 6))
N = 6
ind = np.arange(N) #*
width = 0.55
num3_vals = bar_C.loc[0]
num3 = plt.bar(ind*N+(width*0), num3_vals, width, color = 'mediumturquoise')
num4_vals = bar_C.loc[1]
num4 = plt.bar(ind*N+(width*1), num4_vals, width, color='darkred')
num5_vals = bar_C.loc[2]
num5 = plt.bar(ind*N+(width*2), num5_vals, width, color='lightgreen')
num6_vals = bar_C.loc[3]
num6 = plt.bar(ind*N+(width*3), num6_vals, width, color='purple')
num7_vals = bar_C.loc[4]
print(num7_vals)
num7 = plt.bar(ind*N+(width*4), num7_vals, width, color='red')
num8_vals = bar_C.loc[5]
num8 = plt.bar(ind*N+(width*5), num8_vals, width, color='orange')
num9_vals = bar_C.loc[6]
num9 = plt.bar(ind*N+(width*6), num9_vals, width, color='darkblue')
plt.xlabel("Group")
plt.ylabel('Value')
plt.title("Multi-Bar Graph")
ax = plt.gca()
ax.set_ybound(lower=-0.05,)
# Label each group with the column it is actually plotted from
plt.xticks(ticks = (ind*N),labels = list(bar_C.columns))
plt.legend((num3, num4, num5, num6, num7, num8, num9),
('3','4','5','6','7','8','9'),
loc='upper left')
plt.show()
Output:
pandas : 2.2.3
      B     L     M     C     S     W
0  0.14  0.08  0.08  0.11   NaN  0.12
1  0.12  0.14  0.09  0.10  0.11  0.09
2  0.02  0.06  0.07  0.13  0.17  0.29
3  0.02  0.06  0.08  0.35  0.46  0.63
4   NaN  0.07  0.22   NaN  0.09  0.10
5   NaN  0.12  0.15  0.11  0.10  0.38
6   NaN  0.08  0.06  0.21  0.08  0.26
       B     L     M      C      S     W
0   0.14  0.08  0.08   0.11  -2.00  0.12
1   0.12  0.14  0.09   0.10   0.11  0.09
2   0.02  0.06  0.07   0.13   0.17  0.29
3   0.02  0.06  0.08   0.35   0.46  0.63
4  -2.00  0.07  0.22  -2.00   0.09  0.10
5  -2.00  0.12  0.15   0.11   0.10  0.38
6  -2.00  0.08  0.06   0.21   0.08  0.26
B -2.00
L 0.07
M 0.22
C -2.00
S 0.09
W 0.10
Name: 4, dtype: float64
Pics:
This works thanks to:
bar_C.fillna(-2, inplace=True)
or, equivalently:
bar_C = bar_C.fillna(-2, inplace=False)
From the pandas documentation for fillna:
Returns:
Series/DataFrame or None
Object with missing values filled or None if inplace=True.
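A small check of the two forms above may help. This is only a sketch on a made-up two-row DataFrame (the names df and filled are illustrative, not from the question): without inplace=True the filled copy has to be assigned to a name, otherwise the original keeps its NaN values.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for bar_C, with one NaN per column.
df = pd.DataFrame({"B": [0.14, np.nan], "S": [np.nan, 0.11]})

# Form 1: no inplace argument. fillna returns a new DataFrame and leaves df
# unchanged, so the result must be assigned to have any effect.
filled = df.fillna(-2)
nan_left_in_df = int(df.isna().sum().sum())
nan_left_in_filled = int(filled.isna().sum().sum())

# Form 2: inplace=True modifies df itself.
df.fillna(-2, inplace=True)
nan_left_after_inplace = int(df.isna().sum().sum())

print(nan_left_in_df, nan_left_in_filled, nan_left_after_inplace)  # 2 0 0
```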
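If the goal is to close the gaps rather than mark the missing values, a different approach (not the fillna one above) is to compute the x positions per group and only advance to the next slot when a value is actually present. The sketch below assumes that None, NaN and 0 all mean "no bar", and reuses the question's group spacing of 6 and bar width of 0.55 as defaults. The helper names packed_bars and plot_packed are hypothetical, not part of matplotlib or pandas.

```python
import math


def packed_bars(table, group_order, group_spacing=6.0, width=0.55):
    """Compute left-packed bar positions, skipping missing values.

    table maps each group label to its list of values (one per series).
    Returns a list of (x, height, series_index) tuples.
    """
    bars = []
    for i, label in enumerate(group_order):
        slot = 0  # next free position inside this group
        for k, value in enumerate(table[label]):
            # Assumption: None, NaN and 0 all count as "no bar".
            if value is None or math.isnan(value) or value == 0:
                continue
            bars.append((i * group_spacing + slot * width, value, k))
            slot += 1
    return bars


def plot_packed(table, group_order, series_labels, colors,
                group_spacing=6.0, width=0.55):
    """Draw the packed bars; returns (fig, ax) without calling plt.show()."""
    import matplotlib.pyplot as plt
    from matplotlib.patches import Patch

    bars = packed_bars(table, group_order, group_spacing, width)
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.bar([x for x, _, _ in bars], [h for _, h, _ in bars], width,
           color=[colors[k] for _, _, k in bars])
    # Ticks stay at the start of each group, as in the original code.
    ax.set_xticks([i * group_spacing for i in range(len(group_order))])
    ax.set_xticklabels(group_order)
    # Build the legend by hand, since the bars are drawn in a single call.
    ax.legend(handles=[Patch(color=c, label=s)
                       for c, s in zip(colors, series_labels)],
              loc="upper left")
    return fig, ax
```

With the DataFrame from the question, the table argument can be bar_C.to_dict("list") and group_order can be list(bar_C.columns), followed by plt.show(). Each series keeps its colour, so a bar's identity is still readable from the legend even though its position within the group now shifts left when an earlier value is missing.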
Upvotes: 0