Reputation:
I have excel
ID digit count
03a63f1c5a89fb89fcc4d7cf60e2e6b1 131826356 1
1ea192ddd5c042d71910de18595553a5 100897602 5
1ea192ddd5c042d71910de18595553a5 123581809 7
1ea192ddd5c042d71910de18595553a5 137961455 1
1ea192ddd5c042d71910de18595553a5 163217715 4
1ea192ddd5c042d71910de18595553a5 164622895 3
206dcb0a5bfbdbedb77b6f25bbb4b54b 139156767 12
21a27ac254d8b2fe0a52d052bbbd14a5 267483648 1
I need to sum
all count
to every ID
and next divide every count
to this value.
I try
df = pd.read_excel("count_video.xlsx")
group = df.groupby('ID')['count'].sum()
But I don't know, how can I use it next?
Upvotes: 0
Views: 22
Reputation: 393963
You can use a lambda
to divide each count value by the grouped sum
:
In [155]:
df['count/sum'] = df.groupby('ID')['count'].transform(lambda x: x/x.sum())
df
Out[155]:
ID digit count count/sum
0 03a63f1c5a89fb89fcc4d7cf60e2e6b1 131826356 1 1.00
1 1ea192ddd5c042d71910de18595553a5 100897602 5 0.25
2 1ea192ddd5c042d71910de18595553a5 123581809 7 0.35
3 1ea192ddd5c042d71910de18595553a5 137961455 1 0.05
4 1ea192ddd5c042d71910de18595553a5 163217715 4 0.20
5 1ea192ddd5c042d71910de18595553a5 164622895 3 0.15
6 206dcb0a5bfbdbedb77b6f25bbb4b54b 139156767 12 1.00
7 21a27ac254d8b2fe0a52d052bbbd14a5 267483648 1 1.00
Upvotes: 1