How to call pandas dataframe apply function to return two variables

Question

I want to use the pandas apply() function to return two values per row and assign them to two new columns.

For example:

print(word_list)
['abc', 'lmn', ]

def is_related_content(x):
    for y in word_list:
        if y in x:
            return x, y
    return '', ''

print(df.head())
    str1        
    abcdef      
    hijklmn     
    asddada    
    
# call apply() function like this
df['string'], df['substring'] = df['str1'].apply(lambda x: is_related_content(x))

# it should be like this
print(df.head())
    str1        string      substring
    abcdef      abcdef      abc
    hijklmn     hijklmn     lmn
    asddada     None        None

But I get the following error:

news_df['merge_' + col], news_df[col] = news_df['content'].fillna("").apply(lambda x: is_related_content(x))
ValueError: too many values to unpack (expected 2)

Could anyone help me?
Thanks in advance.

ThePyGuy · Accepted Answer

The function is_related_content returns a tuple for each value in the column it is applied to, so apply() gives back a single Series in which every row holds a tuple. Unpacking that Series into two targets iterates over its rows, not over the two tuple positions, so any Series with more than two rows raises "too many values to unpack (expected 2)". One solution is to apply pd.Series to each tuple and assign the result to a list of column names; the idea is to split the tuples into multiple columns (similar to explode, which splits values into multiple rows):

>>> df[['string', 'substring']] = df['str1'].apply(is_related_content).apply(pd.Series)
>>> df
      str1   string substring
0   abcdef   abcdef       abc
1  hijklmn  hijklmn       lmn
2  asddada
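
As a possible alternative to apply(pd.Series), which tends to be slow on large frames because it builds a Series for every row, the tuples can be transposed with zip, or passed to the DataFrame constructor in one go. This is only a sketch, assuming the sample data from the question; the names pairs and df2 are made up for illustration:

```python
import pandas as pd

word_list = ['abc', 'lmn']

def is_related_content(x):
    # Return the string and the first matching word, or two empty strings.
    for y in word_list:
        if y in x:
            return x, y
    return '', ''

# Sample data copied from the question.
df = pd.DataFrame({'str1': ['abcdef', 'hijklmn', 'asddada']})

# One Series whose rows are 2-tuples (hypothetical name: pairs).
pairs = df['str1'].apply(is_related_content)

# Option 1: zip(*pairs) transposes the row tuples into two column tuples,
# so unpacking into two targets now works.
df['string'], df['substring'] = zip(*pairs)

# Option 2: build a two-column frame from the list of tuples,
# keeping the original index so rows stay aligned.
df2 = df[['str1']].copy()
df2[['string', 'substring']] = pd.DataFrame(pairs.tolist(), index=df2.index)

print(df)
```

Rows without a match get empty strings here because is_related_content returns '', ''; returning None, None instead would give None in those cells.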
