Mahamutha M

Reputation: 1287

How to take the mean of a list column using groupby in a pandas DataFrame?

I want to take the mean of the list of values stored in each row of a column in a pandas DataFrame. The data frame I have is:

df:
mac           gw_mac        ibeaconMajor  ibeaconMinor  rssi
ac233f264920  ac233fc015f6  1             1             [-32, -45]
ac233f26492b  ac233fc015f6  0             0             [-65, -52]
ac233f264933  ac233fc015f6  1             2             [-69, -73]

and the required outcome is:

df:
mac           gw_mac        ibeaconMajor  ibeaconMinor  rssi
ac233f264920  ac233fc015f6  1             1             -38.5
ac233f26492b  ac233fc015f6  0             0             -58.5
ac233f264933  ac233fc015f6  1             2             -71

I have tried the following, but it does not give the required result:

(df.assign(rssi=pd.to_numeric(df['rssi'], errors='coerce'))
   .groupby(['mac', 'gw_mac', 'ibeaconMajor', 'ibeaconMinor'])['rssi']
   .mean())

Upvotes: 1

Views: 318

Answers (2)

Ferran

Reputation: 840

import numpy as np
import pandas as pd

# Rebuild the example frame; each rssi cell holds a list of readings
df = pd.DataFrame([['ac233f264920', 'ac233fc015f6', 1, 1, [-32, -45]],
                   ['ac233f26492b', 'ac233fc015f6', 0, 0, [-65, -52]],
                   ['ac233f264933', 'ac233fc015f6', 1, 2, [-69, -73]],
                   ], columns=['mac', 'gw_mac', 'ibeaconMajor', 'ibeaconMinor', 'rssi'])

df
            mac        gw_mac  ibeaconMajor  ibeaconMinor        rssi
0  ac233f264920  ac233fc015f6             1             1  [-32, -45]
1  ac233f26492b  ac233fc015f6             0             0  [-65, -52]
2  ac233f264933  ac233fc015f6             1             2  [-69, -73]

Compute the mean of each list:

means = [np.mean(x) for x in df['rssi']]

Replace the column with the means:

df['rssi'] = means

df
            mac        gw_mac  ibeaconMajor  ibeaconMinor  rssi
0  ac233f264920  ac233fc015f6             1             1 -38.5
1  ac233f26492b  ac233fc015f6             0             0 -58.5
2  ac233f264933  ac233fc015f6             1             2 -71.0
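
If some rows might hold an empty list or a missing value (an assumption, since the question's data has two readings in every row), the list comprehension can call a small guard instead of np.mean directly. The helper name safe_mean and the sample frame below are hypothetical and only illustrate the idea:

```python
import numpy as np
import pandas as pd


def safe_mean(values):
    """Return the mean of a non-empty list-like, otherwise NaN (hypothetical helper)."""
    if isinstance(values, (list, tuple, np.ndarray)) and len(values) > 0:
        return float(np.mean(values))
    return np.nan


# Illustrative frame: one normal row, one empty list, one missing value
df_gaps = pd.DataFrame({
    'mac': ['ac233f264920', 'ac233f26492b', 'ac233f264933'],
    'rssi': [[-32, -45], [], None],
})

df_gaps['rssi'] = [safe_mean(x) for x in df_gaps['rssi']]
print(df_gaps)
```

Rows without usable readings end up as NaN, so later aggregations can skip them.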

Upvotes: 1

Alexandre B.

Reputation: 5502

Try apply, which calls np.mean on the list in each row:

df['rssi'] = df.rssi.apply(np.mean)

Full example:

import numpy as np
import pandas as pd

data = [["ac233f264920", "ac233fc015f6", 1, 1, [-32, -45]],
        ["ac233f26492b", "ac233fc015f6", 0, 0, [-65, -52]],
        ["ac233f264933", "ac233fc015f6", 1, 2, [-69, -73]]]

df = pd.DataFrame(data, columns=["mac", "gw_mac", "ibeaconMajor", "ibeaconMinor", "rssi"])

df['rssi'] = df.rssi.apply(np.mean)
print(df)
#             mac        gw_mac  ibeaconMajor  ibeaconMinor  rssi
# 0  ac233f264920  ac233fc015f6             1             1 -38.5
# 1  ac233f26492b  ac233fc015f6             0             0 -58.5
# 2  ac233f264933  ac233fc015f6             1             2 -71.0
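
Since the question mentions groupby: if the same key combination can appear in more than one row (an assumption, the sample data has one row per key), a possible sketch is to flatten the lists with DataFrame.explode (pandas 0.25 or later) and then group. The variable names keys and result are only illustrative:

```python
import pandas as pd

data = [["ac233f264920", "ac233fc015f6", 1, 1, [-32, -45]],
        ["ac233f26492b", "ac233fc015f6", 0, 0, [-65, -52]],
        ["ac233f264933", "ac233fc015f6", 1, 2, [-69, -73]]]

df = pd.DataFrame(data, columns=["mac", "gw_mac", "ibeaconMajor", "ibeaconMinor", "rssi"])

keys = ["mac", "gw_mac", "ibeaconMajor", "ibeaconMinor"]

result = (
    df.explode("rssi")                  # one row per individual reading
      .astype({"rssi": float})          # explode leaves object dtype; make it numeric
      .groupby(keys, as_index=False, sort=False)["rssi"]
      .mean()                           # average all readings that share a key
)
print(result)
```

Note that this pools every reading for a key before averaging, which differs from averaging per-row means when the lists have different lengths.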

Upvotes: 1
