Reputation: 11180
I have a DataFrame as shown below:
First I would like to get the overall frequencies of the CODE
values, call it FREQ
, then frequencies of the CODE
values within each AXLE
group and call it GROUP_FREQ
.
I was able to calculate the FREQ
column using the below code:
pivot = df[['AXLES','CODE']].pivot(['CODE']).agg(['count','mean','min','max'])
pivot['FREQ']=grouped_df.AXLES['count']/pivot.AXLES['count'].sum()*100`
this provides a nice grouped DataFrame as shown below:
However, I could not figure out how to calculate the frequencies within each AXLE
group using this grouped_by
DataFrame in the next step.
I tried:
pivot['GROUPFREQ']=pivot['AXLES','mean']['count']/pivot['AXLES','mean']['count'].sum()*100
However, this gives a KeyError: 'count'
.
I may be on the wrong path, and what I am trying to achieve may not be done using groupby
. I decided to check with the community after spending a couple of hours of trial and errors. I'd be glad if you could let me know what you think.
Thanks!
Reproducible input DataFrame:
,CODE,AXLES
0,0101,5
1,001,4
2,0110111,8
3,010111,7
4,0100,5
5,0101,5
6,0110111,8
7,00111,6
8,00111,6
9,0110111,8
10,0100,5
11,0110011,8
12,01011,6
13,0110111,8
14,0110111,8
15,011011,7
16,011011,7
17,011011,7
18,01011,6
19,01011,6
Desired Output for pivot
DataFrame:
CODE,COUNT,AXLES,FREQ,GROUPFREQ
001,1,4,0.05,1.00
00111,2,6,0.1,0.40
0100,2,5,0.1,0.50
0101,2,5,0.1,0.50
01011,3,6,0.15,0.60
010111,1,7,0.05,0.25
0110011,1,8,0.05,0.17
011011,3,7,0.15,0.75
0110111,5,8,0.25,0.83
For the first line of the output: 001 is seen only once in the whole data set (20 records). Thus FREQ = 1/20 = 0.05 When the data is grouped by AXLES, for the AXLES=4 group, 001 is the only record, thus the GROUPFREQ = 1/1 = 1.00. (The same code cannot occur under various AXLE groups, so 001 only needs to be checked for AXLES=4.)
Upvotes: 0
Views: 1478
Reputation: 150785
Do you mean:
pivot['FREQ'] = df.groupby('AXLES').CODE.value_counts(normalize=True).reset_index(level=0,drop=True)
Output:
AXLES FREQ
count mean min max
CODE
1 1 4 4 4 1.000000
100 2 5 5 5 0.500000
101 2 5 5 5 0.500000
111 2 6 6 6 0.400000
1011 3 6 6 6 0.600000
10111 1 7 7 7 0.250000
11011 3 7 7 7 0.750000
110011 1 8 8 8 0.166667
110111 5 8 8 8 0.833333
Upvotes: 1