user6230169
user6230169

Reputation:

Computing with columns using pandas

I have excel

ID  digit   count
03a63f1c5a89fb89fcc4d7cf60e2e6b1    131826356   1
1ea192ddd5c042d71910de18595553a5    100897602   5
1ea192ddd5c042d71910de18595553a5    123581809   7
1ea192ddd5c042d71910de18595553a5    137961455   1
1ea192ddd5c042d71910de18595553a5    163217715   4
1ea192ddd5c042d71910de18595553a5    164622895   3
206dcb0a5bfbdbedb77b6f25bbb4b54b    139156767   12
21a27ac254d8b2fe0a52d052bbbd14a5    267483648   1

I need to sum all count to every ID and next divide every count to this value. I try

df = pd.read_excel("count_video.xlsx")
group = df.groupby('ID')['count'].sum()

But I don't know, how can I use it next?

Upvotes: 0

Views: 22

Answers (1)

EdChum
EdChum

Reputation: 393963

You can use a lambda to divide each count value by the grouped sum:

In [155]:
df['count/sum'] = df.groupby('ID')['count'].transform(lambda x: x/x.sum())
df

Out[155]:
                                 ID      digit  count  count/sum
0  03a63f1c5a89fb89fcc4d7cf60e2e6b1  131826356      1       1.00
1  1ea192ddd5c042d71910de18595553a5  100897602      5       0.25
2  1ea192ddd5c042d71910de18595553a5  123581809      7       0.35
3  1ea192ddd5c042d71910de18595553a5  137961455      1       0.05
4  1ea192ddd5c042d71910de18595553a5  163217715      4       0.20
5  1ea192ddd5c042d71910de18595553a5  164622895      3       0.15
6  206dcb0a5bfbdbedb77b6f25bbb4b54b  139156767     12       1.00
7  21a27ac254d8b2fe0a52d052bbbd14a5  267483648      1       1.00

Upvotes: 1

Related Questions