Nathan Silver

Reputation: 1

Looping a lambda function across multiple pandas columns

I am struggling to loop a lambda function across multiple columns.

samp = pd.DataFrame({'ID':['1','2','3'], 'A':['1C22', '3X35', '2C77'],
                     'B': ['1C35', '2C88', '3X99'], 'C':['3X56', '2C73', '1X91']})

Essentially, I am trying to add three columns to this dataframe with a 1 if there is a 'C' in the string and a 0 if not (i.e. an 'X').

This function works fine when I apply it as a lambda function to each column individually, but I'm doing so for 40 different columns and the code is (I'm assuming) unnecessarily clunky:

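import re

def is_correct(s):
    # Number of 'C' characters in the string (1 or 0 in the sample data)
    return len(re.findall('C', s))

# Bracket assignment creates the column; samp.A_correct = ... would only set an attribute
samp['A_correct'] = samp['A'].apply(is_correct)
samp['B_correct'] = samp['B'].apply(is_correct)
samp['C_correct'] = samp['C'].apply(is_correct)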

I'm confident there is a way to loop this, but I have been unsuccessful thus far.

Upvotes: 0

Views: 414

Answers (2)

gtomer

Reputation: 6564

You can iterate over the columns:

import pandas as pd
import re

df = pd.DataFrame({'ID':['1','2','3'], 'A':['1C22', '3X35', '2C77'],
                     'B': ['1C35', '2C88', '3X99'], 'C':['3X56', '2C73', '1X91']})
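
def is_correct(s):
    # Number of 'C' characters in the string (1 or 0 in the sample data)
    return len(re.findall('C', s))

# Loop over the data columns only; looping over df.columns would also add 'ID_correct'
for col in ['A', 'B', 'C']:
    df[col + '_correct'] = df[col].apply(is_correct)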
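
With 40 columns, the list of targets can be built instead of typed out. The following is a minimal sketch, assuming every column except ID should be checked (that exclusion rule is an assumption, not something stated in the question). It uses str.contains rather than a count, so the flag stays 0 or 1 even if a string holds more than one 'C':

```python
import pandas as pd

samp = pd.DataFrame({'ID': ['1', '2', '3'], 'A': ['1C22', '3X35', '2C77'],
                     'B': ['1C35', '2C88', '3X99'], 'C': ['3X56', '2C73', '1X91']})

# Assumption: every column except 'ID' holds the codes to check.
# Building the list up front keeps the loop from picking up the new columns.
target_cols = [c for c in samp.columns if c != 'ID']

for col in target_cols:
    # regex=False treats 'C' as a literal substring; astype(int) maps True/False to 1/0
    samp[col + '_correct'] = samp[col].str.contains('C', regex=False).astype(int)
```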

Upvotes: 1

Quang Hoang

Reputation: 150735

Let's try apply and join:

samp.join(samp[['A','B','C']].add_suffix('_correct')
                .apply(lambda x: x.str.contains('C'))
                .astype(int)
        ) 

Output:

  ID     A     B     C  A_correct  B_correct  C_correct
0  1  1C22  1C35  3X56          1          1          0
1  2  3X35  2C88  2C73          0          1          1
2  3  2C77  3X99  1X91          1          0          0
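
Note that join returns a new DataFrame and leaves samp unchanged, so the result has to be assigned to be kept. A small sketch of the same idea split into steps, where the names cols, flags and result are only illustrative:

```python
import pandas as pd

samp = pd.DataFrame({'ID': ['1', '2', '3'], 'A': ['1C22', '3X35', '2C77'],
                     'B': ['1C35', '2C88', '3X99'], 'C': ['3X56', '2C73', '1X91']})

cols = ['A', 'B', 'C']  # illustrative; list the real 40 columns here

# One boolean column per source column, converted to 1/0 and renamed
flags = (samp[cols]
         .apply(lambda s: s.str.contains('C', regex=False))
         .astype(int)
         .add_suffix('_correct'))

# join aligns on the index and returns a new frame; samp itself is not modified
result = samp.join(flags)
```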

Upvotes: 0
