Reputation: 1815
I want to call the pandas apply() function so that it returns two values per row, one for each of two new columns.
For example:
print(word_list)
['abc', 'lmn', ]
def is_related_content(x):
    for y in word_list:
        if y in x:
            return x, y
    return '', ''
print(df.head())
str1
abcdef
hijklmn
asddada
# call apply() function like this
df['string'], df['substring'] = df['str1'].apply(lambda x: is_related_content(x))
# it should be like this
print(df.head())
str1 string substring
abcdef abcdef abc
hijklmn hijklmn lmn
asddada None None
But I get the following error:
news_df['merge_' + col], news_df[col] = news_df['content'].fillna("").apply(lambda x: is_related_content(x))
ValueError: too many values to unpack (expected 2)
Could anyone help me?
Thanks in advance.
Upvotes: 0
Views: 453
Reputation: 214927
Tuple unpacking needs a tuple of Series on the right-hand side, but apply returns a single Series of tuples. You can use the .str accessor on the result of apply to pull out each tuple position as its own Series:
Updates:
s = df['str1'].apply(lambda x: is_related_content(x))
df['string'], df['substring'] = s.str[0], s.str[1]
df
# str1 string substring
#0 abcdef abcdef abc
#1 hijklmn hijklmn lmn
#2 asddada
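# Shorter form: unpack the .str accessor directly. This relies on iterating over .str,
# which may fail on recent pandas versions; the indexed form above is the safer choice.
df['string'], df['substring'] = df['str1'].apply(lambda x: is_related_content(x)).str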
df
# str1 string substring
#0 abcdef abcdef abc
#1 hijklmn hijklmn lmn
#2 asddada
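To make the cause of the error concrete, here is a minimal self-contained sketch. The sample df and word_list are reconstructed from the question, and the variable name unpack_error is only illustrative. It shows that unpacking the Series iterates over its three rows rather than over the two tuple positions, and that indexing through .str extracts each position.

```python
import pandas as pd

# Sample data reconstructed from the question.
word_list = ['abc', 'lmn']
df = pd.DataFrame({'str1': ['abcdef', 'hijklmn', 'asddada']})

def is_related_content(x):
    # Return the string and the first matching word, or two empty strings.
    for y in word_list:
        if y in x:
            return x, y
    return '', ''

# One Series whose three elements are 2-tuples.
s = df['str1'].apply(is_related_content)

# Unpacking iterates over the rows (three of them), not over the tuple positions.
try:
    a, b = s
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)

# Indexing through .str picks one tuple position per row.
df['string'] = s.str[0]
df['substring'] = s.str[1]
print(unpack_error)
print(df)
```

With only two rows the unpacking would not raise, but it would silently assign one whole tuple to each column, which is arguably worse than the error.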
Upvotes: 1
Reputation: 18406
The function is_related_content returns a tuple for each value in the column it is applied to, so the result is a single Series of tuples and cannot be unpacked into two columns directly. One solution is to apply pd.Series to each tuple and assign the result to a list of column names. This splits each tuple across multiple columns (similar to how explode splits values across multiple rows):
>>> df[['string', 'substring']] = df['str1'].apply(is_related_content).apply(pd.Series)
>>> df
str1 string substring
0 abcdef abcdef abc
1 hijklmn hijklmn lmn
2 asddada
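A variation on the same idea, offered as a sketch rather than a benchmarked recommendation: instead of calling pd.Series once per row, build a single DataFrame from the list of tuples. This is often reported to be quicker on large frames, though no timing was done here. The sample data is again reconstructed from the question, and the name pairs is just illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question.
word_list = ['abc', 'lmn']
df = pd.DataFrame({'str1': ['abcdef', 'hijklmn', 'asddada']})

def is_related_content(x):
    # Return the string and the first matching word, or two empty strings.
    for y in word_list:
        if y in x:
            return x, y
    return '', ''

# Series of 2-tuples, one per row.
pairs = df['str1'].apply(is_related_content)

# Build both columns in one step; reusing df.index keeps the rows aligned.
df[['string', 'substring']] = pd.DataFrame(pairs.tolist(), index=df.index)
print(df)
```

Passing index=df.index matters when df does not have the default 0..n-1 index, since the assignment aligns on index labels.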
Upvotes: 1