Reputation: 177
I have the table such that:
no Order materials status
1 1000 100 available
2 1000 200 not available
3 1001 500 Feb-20
4 1002 400 available
5 1002 300 not available
6 1002 600 available
7 1002 900 available
8 1003 700 available
9 1003 800 available
And I'd like to have columns that shows:
I was able to get the total number of materials per Order:
ds.groupby('Order').count()
ds['Total Materials'] = ds.groupby('Order')['Order'].transform('count')
But not sure how to add a new columns based on conditions where status equals to each status, So that it will look like this:
no Order materials status Total Materials available not available Feb-20
1 1000 100 available 2 1 1 0
2 1000 200 not available 2 1 1 0
3 1001 500 Feb-20 1 0 0 1
4 1002 400 available 4 3 1 0
5 1002 300 not available 4 3 1 0
6 1002 600 available 4 3 1 0
7 1002 900 available 4 3 1 0
8 1003 700 available 2 2 0 0
9 1003 800 available 2 2 0 0
Basically trying to figure out how to get the rest of the columns. Would appreciate your help!
Upvotes: 1
Views: 81
Reputation: 18367
I would do a combination of pivot table
and merge
:
ds_final = ds.merge(ds.pivot_table(values='Total Materials',index=['Order'],columns='status',aggfunc='count',fill_value=0).reset_index(),how='left',on='Order')
print(ds_final)
Output:
no Order materiales status Total Materials A Feb-20 not A
0 1 1000 100 A 2 1 0 1
1 2 1000 200 not A 2 1 0 1
2 3 1001 500 Feb-20 1 0 1 0
3 4 1002 400 A 4 3 0 1
4 5 1002 300 not A 4 3 0 1
5 6 1002 600 A 4 3 0 1
6 7 1002 900 A 4 3 0 1
7 8 1003 700 A 2 2 0 0
8 9 1003 800 A 2 2 0 0
The pivot table helps to generate the columns from the status
column. So here's the output of the pivot_table alone:
status Order A Feb-20 not A
0 1000 1 0 1
1 1001 0 1 0
2 1002 3 0 1
3 1003 2 0 0
Finally with this output, we can use merge
or concat
to generate the desired output.
Upvotes: 2