Reputation: 427
In the simplified example below, I want to display the total of each stacked bar (3 for A and 7 for B), but my code labels every individual value instead of the summary statistic. What am I doing wrong? Thank you in advance.
import io
import pandas as pd
import plotnine as p9
data_string = """V1,V2,value
A,a,1
A,b,2
B,a,3
B,b,4"""
data = io.StringIO(data_string)
df = pd.read_csv(data, sep=",")
p9.ggplot(df, p9.aes(x='V1', y='value', fill = 'V2')) + \
p9.geom_bar(stat = 'sum') + \
p9.stat_summary(p9.aes(label ='stat(y)'), fun_y = sum, geom = "text")
Reputation: 125373
The issue is the grouping of your data. As you have a global fill aesthetic, your data gets grouped by the categories of V2. Hence stat_summary computes the sum per group of V2. To solve this issue, make fill a local aesthetic of geom_bar or geom_col.
import io
import pandas as pd
import plotnine as p9
data_string = """V1,V2,value
A,a,1
A,b,2
B,a,3
B,b,4"""
data = io.StringIO(data_string)
df = pd.read_csv(data, sep=",")
p9.ggplot(df, p9.aes(x='V1', y='value')) + \
p9.geom_col(p9.aes(fill = 'V2')) + \
p9.stat_summary(p9.aes(label ='stat(y)'), fun_y = sum, geom = "text")
Another option would be to override the global grouping by setting group=1 in stat_summary:
p9.stat_summary(p9.aes(label ='stat(y)', group = 1), fun_y = sum, geom = "text")
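If it helps to see why the grouping matters, the two behaviours can be approximated in plain pandas. This is only a rough sketch of the idea, not what plotnine does internally: the variable names per_fill_group and per_bar are made up for illustration, and groupby stands in for the grouping that stat_summary applies.

```python
import io
import pandas as pd

data_string = """V1,V2,value
A,a,1
A,b,2
B,a,3
B,b,4"""
df = pd.read_csv(io.StringIO(data_string))

# With fill='V2' as a global aesthetic, the summary is taken per
# (V1, V2) combination, so each "sum" is just one original value.
per_fill_group = df.groupby(["V1", "V2"])["value"].sum()

# With fill local to the bar layer (or group=1 in stat_summary),
# the summary is taken per V1 only, giving one total per bar.
per_bar = df.groupby("V1")["value"].sum()

print(per_fill_group.tolist())
print(per_bar.to_dict())
```

The first result has four entries, one per segment, which matches the labels in the question's plot. The second has one total per bar, 3 for A and 7 for B, which is what both fixes above produce.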