Nur Atiqah

Reputation: 105

How to group by and add the group result to each row as a new column using Python?

I'm still learning Python, so I need some help. I have the following data:

Product | No_unit_tested | Yield
A       | 1              | 0.320
A       | 4              | 0.780
B       | 5              | 0.900
C       | 3              | 0.670
C       | 7              | 0.540
D       | 7              | 1.000
D       | 9              | 0.800

and I want to produce the following results:

Product | No_unit_tested | Yield | Mean
A       | 1              | 0.320 | 0.550
A       | 4              | 0.780 | 0.550
B       | 5              | 0.900 | 0.900
C       | 3              | 0.670 | 0.605
C       | 7              | 0.540 | 0.605
D       | 7              | 1.000 | 0.900
D       | 9              | 0.800 | 0.900

Using df = df.groupby('Product')['Yield'].mean(), I manage to get the mean for every product, but the result has only one row per product, so I can't produce the table I want, where every original row carries its product's mean. How can I do this in Python using pandas?

Upvotes: 0

Views: 46

Answers (2)

Red

Reputation: 27577

Here is what you can do:

s1 = '''Product | No_unit_tested | Yield

A |1 |0.320

A |4 |0.780

B |5 |0.900

C |3 |0.670

C |7 |0.540

D |7 |1.000

D |9 |0.800'''

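# Collect the yields for each product, skipping the three header tokens.
d = {}
s2 = s1.replace('|', ' ').split()
for n in range(5, len(s2), 3):
    d.setdefault(s2[n-2], []).append(float(s2[n]))

# Rebuild the table, appending each product's mean to its rows.
s3 = [s1.split('\n\n')[0] + ' |Mean']
for k in d:
    for l in s1.split('\n'):
        # Compare the first field exactly; a substring test such as
        # `k in l` would also match the letter elsewhere in the line.
        if l.split('|')[0].strip() == k:
            s3.append(l + f' |{sum(d[k]) / len(d[k]):.3f}')

print('\n\n'.join(s3))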

Output:

Product | No_unit_tested | Yield |Mean

A |1 |0.320 |0.550

A |4 |0.780 |0.550

B |5 |0.900 |0.900

C |3 |0.670 |0.605

C |7 |0.540 |0.605

D |7 |1.000 |0.900

D |9 |0.800 |0.900

Upvotes: 0

Balaji Ambresh

Reputation: 5012

Here you go:

import pandas as pd
from io import StringIO

df = pd.read_csv(StringIO(
    """Product|No_unit_tested|Yield
A|1|0.320
A|4|0.780
B|5|0.900
C|3|0.670
C|7|0.540
D|7|1.000
D|9|0.800"""
), sep='|')
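# Per-product mean of Yield: a Series indexed by Product.
means = df.groupby('Product')['Yield'].mean()
means.name = 'Mean'
# Join on the Product index so each row picks up its group's mean.
result = df.set_index('Product').join(means).reset_index()
print(result)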

Output:

  Product  No_unit_tested  Yield   Mean
0       A               1   0.32  0.550
1       A               4   0.78  0.550
2       B               5   0.90  0.900
3       C               3   0.67  0.605
4       C               7   0.54  0.605
5       D               7   1.00  0.900
6       D               9   0.80  0.900
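
As an alternative to the join (not part of the original answer), pandas can broadcast the group mean back onto the original rows with groupby(...).transform('mean'). This is a minimal sketch that assumes the same column names as the question and builds the frame inline instead of reading it from text:

```python
import pandas as pd

# Same data as in the question, built directly as a DataFrame.
df = pd.DataFrame({
    "Product": ["A", "A", "B", "C", "C", "D", "D"],
    "No_unit_tested": [1, 4, 5, 3, 7, 7, 9],
    "Yield": [0.320, 0.780, 0.900, 0.670, 0.540, 1.000, 0.800],
})

# transform returns a Series aligned with df's original index,
# so every row receives the mean of its own Product group.
df["Mean"] = df.groupby("Product")["Yield"].transform("mean")

print(df)
```

Because transform keeps the original index, no set_index/join/reset_index round trip is needed and the row order is unchanged.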

Upvotes: 1
