Reputation: 367
I have a dataset df:
users number
user1 1
user2 34
user3 56
user4 45
user5 4
user1 3
user5 11
user1 3
When making a bar plot like this:
plt.bar(x['users'], x['number'].sort_values(ascending=False), color="blue")
Does it take the mean of every user in the number column during the plot?
What if I want the sum of all the numbers in the number column, per user, to appear in the bar plot in descending order?
I tried this:
plt.bar(x['users'], x['number'].sum().sort_values(ascending=False), color="blue")
which gives:
AttributeError: 'numpy.float64' object has no attribute 'sort_values'
code:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'number': [10,34,56,45,33],
                   'user': ['user1','user2','user3','user4','user1']})
#index=['user1','user2','user3','user4','user1'])
plt.bar(df['user'], df['number'], color="blue")
For a user that has several values, the plot always shows only the largest one.
Upvotes: 2
Views: 7775
Reputation: 39072
I am not sure whether this is what you want, or whether you want to first group the values by user and then plot the totals in descending order. To plot every row as its own bar, sorted by number:
x = x.sort_values('number',ascending=False)
plt.bar(range(len(x['users'])), x['number'], color="blue")
plt.xticks(range(len(x['users'])), x['users'])
plt.ylabel('Numbers')
Output
If you want to plot the mean of each user, use the following code:
x1 = x.groupby('users').mean().reset_index()
plt.bar(range(len(x1)), x1['number'], color="blue")
plt.xticks(range(len(x1)), x1['users'])
plt.ylabel('Mean')
Output
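For the per-user sum in descending order that the question asks about, a minimal sketch along the same lines follows. The AttributeError in the question arises because calling .sum() on the whole number column returns a single scalar, which has no sort_values method; grouping by users first keeps one value per user. The name totals is only illustrative, and the data is copied from the question's table.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Data copied from the table in the question.
x = pd.DataFrame({
    'users': ['user1', 'user2', 'user3', 'user4', 'user5', 'user1', 'user5', 'user1'],
    'number': [1, 34, 56, 45, 4, 3, 11, 3],
})

# Sum per user (a Series indexed by user), then sort that Series.
totals = x.groupby('users')['number'].sum().sort_values(ascending=False)

# One bar per user, tallest first.
plt.bar(totals.index, totals.values, color="blue")
plt.ylabel('Sum')
```

Sorting after the groupby matters: sorting the raw rows first would not guarantee any particular order of the grouped totals.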
What if you don't sort or group by? All the bars are still drawn, but you can't tell apart bars that share the same x-value because alpha=1 by default, so the taller bar hides the shorter one. I used alpha=0.2 to highlight my point: now you can see that at user1 there are two bars, one behind the other.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'number': [10,34,56,45,51], 'user': ['user1','user2','user3','user4','user1']})
plt.bar(df['user'], df['number'], color="blue", linewidth=2, edgecolor='black', alpha=0.2)
Output
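If you would rather check this without relying on transparency, one possible way is to inspect the rectangles that the bar call returns. This is only a sketch; the names bars, n_bars and left_edges are illustrative and not part of the original code.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'number': [10, 34, 56, 45, 51],
                   'user': ['user1', 'user2', 'user3', 'user4', 'user1']})

fig, ax = plt.subplots()
bars = ax.bar(df['user'], df['number'], color="blue", alpha=0.2)

# One rectangle is drawn per row, not one per unique user.
n_bars = len(bars.patches)

# Left edge of each rectangle; rows with the same user share a position.
left_edges = [round(p.get_x(), 6) for p in bars.patches]
print(n_bars, left_edges)
```

The first and last entries of left_edges coincide because both rows belong to user1, which is why no aggregation (mean, sum or max) is happening: the bars are simply stacked behind one another.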
Upvotes: 2