Reputation: 1373
I would like to use a couple of columns as row ID while taking count of grouping based on Time. Look at below illustration:
X Y Z Time
0 1 2 10
0 2 3 10
1 0 2 15
1 0 0 23
Transforms into this:
Category Count Time
X 0 10
X 1 15
X 1 23
Y 3 10
Y 0 15
Y 0 23
Z 5 10
Z 2 15
Z 0 23
What is happening is that X occur 0 times for the time 10 but 1 time for 15 and 23.
Y occur 3 times at 10'clock but none at 15 and 23. etc.
Upvotes: 2
Views: 130
Reputation: 863651
I think you need melt
with groupby
aggregating sum
and last sort_values
by column Category
:
print pd.melt(df, id_vars='Time', var_name='Category', value_name='Count')
.groupby(['Time','Category']).sum().reset_index().sort_values('Category')
Time Category Count
0 10 X 0
3 15 X 1
6 23 X 1
1 10 Y 3
4 15 Y 0
7 23 Y 0
2 10 Z 5
5 15 Z 2
8 23 Z 0
Another solution with stack
:
df1 = df.set_index('Time')
.stack()
.groupby(level=[0,1])
.sum()
.reset_index()
.sort_values('level_1')
df1.columns = ['Time','Category','Count']
df1 = df1[['Category','Count','Time']]
print df1
Category Count Time
0 X 0 10
3 X 1 15
6 X 1 23
1 Y 3 10
4 Y 0 15
7 Y 0 23
2 Z 5 10
5 Z 2 15
8 Z 0 23
Upvotes: 2