singularity2047
singularity2047

Reputation: 1071

How to compare two columns and return value from a third column in Pandas dataframe

sample data

sample = ({'name':['Delinquency Rate','Cumulative Probability'],
           'value':['Dlnqy','Prbblty'],
           'new_name':['Dlnqncy Rt','Cmltv Prbblty']})

test = pd.DataFrame(sample)

test

new_name is created by removing all the vowels from 'name'. I want to compare first 3 characters of 'value' and first 3 character of 'each of the word' in 'new_name' and if they match I want to return the value from the name column. For example 'Dln' in value exists in 'Dlnqncy' of new_name, so we will return 'Delinquency' from name. The logic should work like

'if value[:3] in new_name[:3] then return name'

Following is what I have so far, which works fine if I have only two columns. But it doesn't work if I want to compare value and new_name and retun name.

def get_matches(name, value, new_name, default=''):
    return next( (word for word in new_name.split() if str(value)[:3] in word[:3]),default)


test['match'] = test[['name', 'value', 'new_name']].apply(lambda row: get_matches(*row, default=' '), axis=1)

In the following table the column 'match' should have 'Delinquency' and 'Probability' (as it is there in the 'name' column)

enter image description here

Upvotes: 0

Views: 382

Answers (1)

William Rosenbaum
William Rosenbaum

Reputation: 355

Here you go:

(test
 .assign(match=lambda x: np.where((x.new_name.str[:3] == x.value.str[:3]), x.name, x.value))
)

Upvotes: 1

Related Questions