Reputation: 99
I cannot find any explanation about numpy.where function.
I have dataframe:
activity count
0 1 25
1 2 38
2 3 49
3 4 55
4 5 61
5 6 74
6 7 82
7 8 98
8 9 103
9 10 4084
and I want to count values:
and crate a charts for this two condition which shows sum count values. I have tried:
under_total = df[df.activity < 6]
under_total.loc['under_total'] = under_total.sum(axis = 0)
OR
df['below_expectations'] = np.where(df['actibity'] < 6, df['activity'], df['hoursActive'])
But it seems I dont get it somewhere.
Upvotes: 1
Views: 214
Reputation: 863116
Use DataFrame.assign
for new column by np.where
and then aggregate sum
:
s = (df.assign(cat = np.where(df['activity'] < 6, 'less 6', 'higher 6'))
.groupby('cat')['count']
.sum())
print (s)
cat
higher 6 4441
less 6 228
Name: count, dtype: int64
and then plot by Series.plot.bar
:
s.plot.bar()
Upvotes: 0
Reputation: 261590
You can compute a group for GroupBy.sum
to feed it to plot.bar
:
(df.groupby(df['activity'].lt(6)
.map({True: '<6', False: '≥6'}))
['count'].sum().plot.bar()
)
or with pandas.cut
, which gives you the advantage to have as many categories as you want:
(df.groupby(pd.cut(df['activity'], bins=[0,6,float('inf')],
labels=['<6', '≥6'], right=False))
['count'].sum().plot.bar()
)
output:
intermediate data:
(df.groupby(df['activity'].lt(6)
.map({True: '<6', False: '≥6'}))
['count'].sum()
)
output:
activity
<6 228
≥6 4441
Name: count, dtype: int64
Upvotes: 1