Reputation: 5473
I have two columns in a data frame containing sets.
How do I get a new column where each row contains the union of the items from the respective columns?
For example:
col1 : [{1,2} , {4,5}]
col2 : [{1,6} , {7,5}]
union : [{1,2,6}, {4,5,7}]
A naive try:
df['union'] = df['col1'].apply(lambda x: x.union(df['col2']))
does not work.
Upvotes: 2
Views: 14208
Reputation: 863281
I think you are very close. Use apply on the DataFrame with axis=1, so the function is called once per row and can read both columns of that row:
import pandas as pd
df = pd.DataFrame([[{1,2} , {1,6}], [{4,5} , {7,5}]], columns=['col1', 'col2'])
df['union'] = df.apply(lambda x: x['col1'].union(x['col2']), axis=1)
print (df)
     col1    col2      union
0  {1, 2}  {1, 6}  {1, 2, 6}
1  {4, 5}  {5, 7}  {4, 5, 7}
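As for why the attempt in the question fails: `x.union(df['col2'])` hands the whole column to `set.union`, which then tries to add each element of that column (each one a set) to the result, and sets are unhashable. A small sketch to reproduce this, where the names `failed` and `err_msg` are mine and only there for illustration:

```python
import pandas as pd

df = pd.DataFrame([[{1, 2}, {1, 6}], [{4, 5}, {7, 5}]], columns=['col1', 'col2'])

failed = False
err_msg = ""
try:
    # x is a single set from col1, but df['col2'] is the entire column.
    # set.union iterates over that column and tries to add each set
    # as an element, which raises TypeError because sets are unhashable.
    df['col1'].apply(lambda x: x.union(df['col2']))
except TypeError as exc:
    failed = True
    err_msg = str(exc)

print(failed, err_msg)
```

The row-wise apply avoids this because each call sees one set from col1 and one set from col2.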
Another solution uses the | operator, which performs set union:
df['union'] = df.apply(lambda x: (x['col1'] | x['col2']), axis=1)
print (df)
     col1    col2      union
0  {1, 2}  {1, 6}  {1, 2, 6}
1  {4, 5}  {5, 7}  {4, 5, 7}
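If you prefer to avoid a row-wise apply, one possible alternative (a sketch, not something from the question) is to pair the two columns with zip and build the new column in a list comprehension. This is often quicker than apply with axis=1 on larger frames, though I have not benchmarked it here:

```python
import pandas as pd

df = pd.DataFrame([[{1, 2}, {1, 6}], [{4, 5}, {7, 5}]], columns=['col1', 'col2'])

# Walk both columns in parallel and union each pair of sets.
# The resulting list has one set per row, in the original row order.
df['union'] = [a | b for a, b in zip(df['col1'], df['col2'])]

print(df)
```

The list is assigned positionally, so it relies on both columns coming from the same DataFrame and therefore having the same length and order.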
Upvotes: 4