user6106624

Loop and Accumulate Sum from Pandas Column Made of Lists

Currently, my pandas DataFrame looks like the following:

Row_X
["Medium", "High", "Low"]
["Medium"]

My intention is to iterate through the list in each row such that:

summation = 0

for value in df["Row_X"]:
    if "High" in value:
        summation = summation + 10
    elif "Medium" in value:
        summation = summation + 5
    else:
        summation = summation + 0

Finally, I wish to apply this to each row and create a new column that looks like the following:

Row_Y
15
5

My assumption is that either np.select() or apply() could be used here, but so far I have run into errors implementing either.

Upvotes: 0

Views: 1111

Answers (4)

ansev

Reputation: 30920

We can do:

mapper = {'Medium' : 5, 'High' : 10}

df['Row_Y'] = [sum([mapper[word] for word in l 
                    if word in mapper]) 
               for l in df['Row_X']]

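On pandas 0.25.0 or later (where explode() is available), we can use the following. Note that sum(level=0) was removed in pandas 2.0, so groupby(level=0).sum() is used instead, and that the result is a float column because unmapped values such as 'Low' become NaN:

df['Row_Y'] = df['Row_X'].explode().map(mapper).groupby(level=0).sum()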

print(df)

                 Row_X  Row_Y
0  [Medium, High, Low]     15
1             [Medium]      5
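
For reference, here is a minimal self-contained sketch of the explode approach, assuming the sample data from the question. The fillna(0) and astype(int) steps are additions for illustration; they treat unmapped labels such as 'Low' as 0 and keep the result as integers:

```python
import pandas as pd

# Sample data assumed from the question
df = pd.DataFrame({"Row_X": [["Medium", "High", "Low"], ["Medium"]]})
mapper = {"Medium": 5, "High": 10}

df["Row_Y"] = (
    df["Row_X"]
    .explode()          # one row per list element, original index repeated
    .map(mapper)        # labels missing from mapper become NaN
    .fillna(0)          # count unmapped labels as 0
    .groupby(level=0)   # regroup by the original row index
    .sum()
    .astype(int)
)
print(df)
```

Because the grouped result keeps the original index, assigning it back to df aligns each sum with the row it came from.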

Upvotes: 4

Kenan

Reputation: 14094

Map your function to the series:

import pandas as pd

def function(x):
    summation = 0
    for i in x:
        if "High" in i:
            summation += 10
        elif "Medium" in i:
            summation += 5
        else:
            summation += 0
    return summation


df = pd.DataFrame({'raw_x': [['Medium', 'High', 'Low'], ['Medium']]})
df['row_y'] = df['raw_x'].map(function)

You can do it in a shorter format with a dictionary lookup:

mapping = {'High': 10, 'Medium': 5, 'Low': 0}
df['row_y'] = df['raw_x'].map(lambda x: sum(mapping.get(i, 0) for i in x))
print(df)

                 raw_x  row_y
0  [Medium, High, Low]     15
1             [Medium]      5

Upvotes: 1

ggaurav

Reputation: 1804

Perhaps a cleaner option is to convert each list to a Series and use map directly. The sums come out as floats because labels missing from the mapping become NaN:

mapp = {'Medium' : 5, 'High' : 10}
df['Row_Y'] = df['Row_X'].apply(lambda x: pd.Series(x).map(mapp).sum())
df

                 Row_X  Row_Y
0  [Medium, High, Low]   15.0
1             [Medium]    5.0

Upvotes: 1

KVEER

Reputation: 123

This solution should work, provided every label in Row_X is a key in vals:

vals = {"High": 10, "Medium": 5, "Low": 0}
df['Row_Y'] = df.apply(lambda row: sum(vals[i] for i in row['Row_X']), axis=1)
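
If a list may contain a label that is not in the dictionary, the direct lookup raises a KeyError. A hedged variant, assuming unknown labels should count as 0, is to use dict.get with a default. The "Unknown" label below is a made-up value for illustration:

```python
import pandas as pd

vals = {"High": 10, "Medium": 5, "Low": 0}

# "Unknown" is a hypothetical label that is not a key in vals
df = pd.DataFrame({"Row_X": [["Medium", "High", "Unknown"], ["Medium"]]})

# vals.get(label, 0) falls back to 0 instead of raising KeyError
df["Row_Y"] = df["Row_X"].apply(
    lambda labels: sum(vals.get(label, 0) for label in labels)
)
print(df)
```

Applying to the single column also avoids the row-wise df.apply(..., axis=1), which is not needed when only one column is read.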

Upvotes: 0
