Julia
Julia

Reputation: 109

Choose one entry from a list if its key contains a string from another column

I have a question regarding my dataframe. Specifically, in one column, for each row, I have a list of speakers and speeches. Now, I want to choose exactly one speech, based on whether the speaker is the one I am looking for, which is noted within another column. So one column provides the last name I am looking for and the other column provides a list of all speakers (first and last name) and their speech and I want to create a new columns where this speech is stored in the respective row.

So my initial dataset looks like this:

ticker  year    quarter exel_lname  jobposition speech
xx      2009    1       Angle       CEO         [("Mike Angle", "Thank you"), ("Barbara Barth", "It is")]
xx      2009    1       Barth       CFO         [("Mike Angle", "Thank you"), ("Barbara Barth", "It is")]
xx      2009    2       Angle       CEO         [("Mike Angle", "I am surprised"), ("Barbara Barth", "So am I")]
xx      2009    2       Barth       CFO         [("Mike Angle", "I am surprised"), ("Barbara Barth", "So am I")]
yy      2008    3       Cruz        CEO         [("Damien Cruz", "Hello"), ("Lara Dolm", "Nice to meet you")]
yy      2008    3       Dolm        CFO         [("Damien Cruz", "Hello"), ("Lara Dolm", "Nice to meet you")]

For row one for instance, I want to check each key-value pair whether the first list entry contains the last name, if no continue, if yes, take the speech part (i.e. second list entry) and store it in new column. As such, I want the following dataset (I hid the initial column speech here, but it should still be contained, so I do not want to replace it, just create a new column).

ticker  year    quarter exel_lname  jobposition speechmanager
xx      2009    1       Angle       CEO         "Thank you"
xx      2009    1       Barth       CFO         "It is"
xx      2009    2       Angle       CEO         "I am surprised"
xx      2009    2       Barth       CFO         "So am I"
yy      2008    3       Cruz        CEO         "Hello"
yy      2008    3       Dolm        CFO         "Nice to meet you"

Could someone help me how to solve this in Python 3?

Thank you!! Julia

Upvotes: 1

Views: 71

Answers (1)

cmaher
cmaher

Reputation: 5215

This is perhaps best accomplished by writting a function, and then applying it row-wise:

def get_speech(row):
    matches = list(filter(lambda x: x[0].endswith(row['exel_lname']), row['speech']))
    if len(matches) > 0:
        return matches[0][1]
    return ''

df['speechmanager'] = df.apply(get_speech, axis=1)

Upvotes: 2

Related Questions