Reputation: 147
I have a data frame:
import random
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
val = random.sample(range(0, 1000), 1000)
prob = []
for i in range(0,1000):
    x = random.uniform(0,1)
    prob.append(x)
d = {'Value': val, 'Probability': prob}
df = pd.DataFrame(data=d)
Here, I create an additional dataframe based on the values of df:
group_prob = df.groupby(pd.cut(df['Probability'], np.arange(0, 1.1, 0.1)))['Value'].mean()
group_prob = group_prob.fillna(0.0)
group_prob = pd.DataFrame(group_prob)
group_prob["Count"] = df.groupby(pd.cut(df['Probability'], np.arange(0, 1.1, 0.1)))['Value'].count()
group_prob["Text"] = group_prob['Value'].round(2).astype(str)+' - '+group_prob['Count'].astype(str)
I want to create a bar plot:
def barplot_groups(group_, var_names=['','']):
    fig, ax = plt.subplots(figsize=(15,7))
    # x and y are passed by keyword; newer seaborn releases no longer accept them positionally
    sns.barplot(x=group_.index, y=group_.values, ax=ax)
    max_val = group_.values.max()
    plt.xlabel(f'{var_names[0]}')
    plt.ylabel(f'Average of {var_names[1]}')
    plt.title(f'Relationship between {var_names[0]} and {var_names[1]}')
    plt.show()
This is my result:
barplot_groups(group_prob['Value'], ['Probability','Value'])
I also want to add labels to the plot based on group_prob['Text']. Since the values are long, I want to place them vertically. What is the best way to do this with the seaborn library?
This is an example of what I am trying to add (the white border is not needed).
Upvotes: 0
Views: 694
Reputation: 35696
With matplotlib 3.4.0 or newer, bar_label can be given a collection of labels such as group_prob['Text']:
def barplot_groups(group_, my_labels, var_names):
    fig, ax = plt.subplots(figsize=(15, 7))
    sns.barplot(x=group_.index, y=group_.values, ax=ax)
    ax.set(xlabel=f'{var_names[0]}',
           ylabel=f'Average of {var_names[1]}',
           title=f'Relationship between {var_names[0]} and {var_names[1]}')
    ax.bar_label(ax.containers[0], labels=my_labels, label_type='center',
                 rotation=90)
    plt.show()
Function call:
barplot_groups(group_prob['Value'],
               my_labels=group_prob['Text'],
               var_names=['Probability', 'Value'])
group_prob:
                  Value  Count          Text
Probability
(0.0, 0.1]   482.278846    104  482.28 - 104
(0.1, 0.2]   495.018692    107  495.02 - 107
(0.2, 0.3]   529.750000     92   529.75 - 92
(0.3, 0.4]   490.933333    105  490.93 - 105
(0.4, 0.5]   469.858491    106  469.86 - 106
(0.5, 0.6]   515.640777    103  515.64 - 103
(0.6, 0.7]   545.450980    102  545.45 - 102
(0.7, 0.8]   458.900000     80    458.9 - 80
(0.8, 0.9]   468.100000    110   468.1 - 110
(0.9, 1.0]   542.153846     91   542.15 - 91
Just add backgroundcolor='white' to bar_label for a white background:
ax.bar_label(ax.containers[0], labels=my_labels, label_type='center',
             rotation=90, backgroundcolor='white')
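On matplotlib versions older than 3.4.0, bar_label is not available. One possible fallback, sketched below, is to write a rotated text at the centre of each bar patch. This assumes the labels are in the same order as the bars in ax.patches. The helper name label_bars_vertically is hypothetical (it is not part of matplotlib or seaborn), and the sketch uses plain ax.bar with a few made-up values so that it is self-contained:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def label_bars_vertically(ax, labels, **text_kwargs):
    """Write one rotated label at the centre of each bar on `ax`.

    Hypothetical helper for illustration; assumes `labels` is in the
    same order as the bars in `ax.patches`.
    """
    texts = []
    for patch, label in zip(ax.patches, labels):
        # Centre of the bar rectangle in data coordinates
        x = patch.get_x() + patch.get_width() / 2
        y = patch.get_y() + patch.get_height() / 2
        texts.append(
            ax.text(x, y, str(label), ha="center", va="center",
                    rotation=90, **text_kwargs)
        )
    return texts


fig, ax = plt.subplots(figsize=(15, 7))
ax.bar(["a", "b", "c"], [482.28, 495.02, 529.75])  # made-up example values
texts = label_bars_vertically(
    ax, ["482.28 - 104", "495.02 - 107", "529.75 - 92"]
)
```

Extra keyword arguments are forwarded to ax.text, so backgroundcolor='white' should work here as well. With a seaborn bar plot, the same helper would be called on the ax passed to sns.barplot.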
Reproducible with seed 5:
import random
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
random.seed(5)
val = random.sample(range(0, 1000), 1000)
prob = []
for i in range(0, 1000):
    x = random.uniform(0, 1)
    prob.append(x)
d = {'Value': val, 'Probability': prob}
df = pd.DataFrame(data=d)
group_prob = df.groupby(
    pd.cut(df['Probability'], np.arange(0, 1.1, 0.1))
)['Value'].mean()
group_prob = group_prob.fillna(0.0)
group_prob = pd.DataFrame(group_prob)
group_prob["Count"] = df.groupby(
    pd.cut(df['Probability'], np.arange(0, 1.1, 0.1))
)['Value'].count()
group_prob["Text"] = (
    group_prob['Value'].round(2).astype(str)
    + ' - ' +
    group_prob['Count'].astype(str)
)
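As a side note, the mean and the count can probably be computed in a single groupby pass with agg instead of grouping twice. This is only a sketch of an alternative, not the code used above. The names bins and summary are introduced here for illustration, and observed=False is passed explicitly on the assumption that empty bins should be kept, as in the original:

```python
import random

import numpy as np
import pandas as pd

random.seed(5)
val = random.sample(range(0, 1000), 1000)
prob = [random.uniform(0, 1) for _ in range(1000)]
df = pd.DataFrame({'Value': val, 'Probability': prob})

# Bin the probabilities once and reuse the result for both aggregates
bins = pd.cut(df['Probability'], np.arange(0, 1.1, 0.1))
summary = (
    df.groupby(bins, observed=False)['Value']
    .agg(Value='mean', Count='count')  # named aggregation: column=function
)
summary['Value'] = summary['Value'].fillna(0.0)  # empty bins have no mean
summary['Text'] = (
    summary['Value'].round(2).astype(str)
    + ' - '
    + summary['Count'].astype(str)
)
```

summary should then have the same Value, Count and Text columns as group_prob, so summary['Value'] and summary['Text'] can be passed to barplot_groups in the same way.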
Upvotes: 1