Nurbek Kuantyrov
Nurbek Kuantyrov

Reputation: 177

Python count column values by other column

I have the table such that:

no  Order   materials   status
1   1000    100         available
2   1000    200         not available
3   1001    500         Feb-20
4   1002    400         available
5   1002    300         not available
6   1002    600         available
7   1002    900         available
8   1003    700         available
9   1003    800         available

And I'd like to have columns that shows:

  1. Total number of materials per Order
  2. Total number of materials with their status per Order

I was able to get the total number of materials per Order:

ds.groupby('Order').count()
ds['Total Materials'] = ds.groupby('Order')['Order'].transform('count')

But not sure how to add a new columns based on conditions where status equals to each status, So that it will look like this:

no  Order   materials   status         Total Materials  available   not available   Feb-20
1   1000    100         available       2                    1            1           0
2   1000    200         not available   2                    1            1           0
3   1001    500         Feb-20          1                    0            0           1
4   1002    400         available       4                    3            1           0
5   1002    300         not available   4                    3            1           0
6   1002    600         available       4                    3            1           0
7   1002    900         available       4                    3            1           0
8   1003    700         available       2                    2            0           0
9   1003    800         available       2                    2            0           0

Basically trying to figure out how to get the rest of the columns. Would appreciate your help!

Upvotes: 1

Views: 81

Answers (1)

Celius Stingher
Celius Stingher

Reputation: 18367

I would do a combination of pivot table and merge:

ds_final = ds.merge(ds.pivot_table(values='Total Materials',index=['Order'],columns='status',aggfunc='count',fill_value=0).reset_index(),how='left',on='Order')
print(ds_final)

Output:

   no  Order  materiales  status  Total Materials  A  Feb-20  not A
0   1   1000         100       A                2  1       0      1
1   2   1000         200   not A                2  1       0      1
2   3   1001         500  Feb-20                1  0       1      0
3   4   1002         400       A                4  3       0      1
4   5   1002         300   not A                4  3       0      1
5   6   1002         600       A                4  3       0      1
6   7   1002         900       A                4  3       0      1
7   8   1003         700       A                2  2       0      0
8   9   1003         800       A                2  2       0      0

Some extra explanation:

The pivot table helps to generate the columns from the status column. So here's the output of the pivot_table alone:

status  Order  A  Feb-20  not A
0        1000  1       0      1
1        1001  0       1      0
2        1002  3       0      1
3        1003  2       0      0

Finally with this output, we can use merge or concat to generate the desired output.

Upvotes: 2

Related Questions