Reputation: 3385
This question is motivated by a previous question I asked, "Pandas groupby make two columns lists separately". This time I want to create a new column in which each value is a single list of tuples, formed by zipping the values from the other two columns. For example:
# Original DataFrame
fruit sport weather
0 apple [baseball, basketball] [sunny, windy]
1 banana [swimming, hockey] [cloudy, windy]
2 orange [football] [sunny]
# Desired DataFrame
fruit sport weather pairs
0 apple [baseball, basketball] [sunny, windy] [(baseball, sunny), (basketball, windy)]
1 banana [swimming, hockey] [cloudy, windy] [(swimming, cloudy), (hockey, windy)]
2 orange [football] [sunny] [(football, sunny)]
I've tried the following code, but it gives me something else:
df['pairs'] = list(zip(df['sport'], df['weather']))
# Output DataFrame
fruit sport weather pairs
0 apple [baseball, basketball] [sunny, windy] ([baseball, sunny], [basketball, windy])
1 banana [swimming, hockey] [cloudy, windy] ([swimming, cloudy], [hockey, windy])
2 orange [football] [sunny] ([football], [sunny])
As you can see, it's "reversed" from what I want to do. What is the appropriate way that I should go about this? Thanks in advance.
Upvotes: 0
Views: 1547
Reputation: 61910
You could take advantage of the fact that map accepts several iterables and passes their elements to the function in parallel, so it effectively has a zip built in:
df['pairs'] = [list(x) for x in map(zip, df['sport'], df['weather'])]
print(df)
Output
fruit ... pairs
0 apple ... [(baseball, sunny), (basketball, windy)]
1 banana ... [(swimming, cloudy), (hockey, windy)]
2 orange ... [(football, sunny)]
[3 rows x 4 columns]
Or you could use itertuples:
df['pairs'] = [list(zip(*x)) for x in df[['sport', 'weather']].itertuples(index=False)]
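To see what both variants do without pandas in the way, here is a minimal sketch that uses plain lists as stand-ins for df['sport'] and df['weather'] (the values are copied from the question; the variable names are only for illustration). It assumes the two lists in each row have the same length, since zip stops at the shorter one.

```python
# Plain lists standing in for the two DataFrame columns (values from the question).
sport = [["baseball", "basketball"], ["swimming", "hockey"], ["football"]]
weather = [["sunny", "windy"], ["cloudy", "windy"], ["sunny"]]

# map() with two iterables calls zip(sport[i], weather[i]) for each row i,
# so the row-level pairing happens inside map itself.
pairs_via_map = [list(z) for z in map(zip, sport, weather)]

# The itertuples variant does the same in two steps: each row arrives as a
# (sport_list, weather_list) tuple, and zip(*row) unpacks it into two arguments.
pairs_via_unpacking = [list(zip(*row)) for row in zip(sport, weather)]

print(pairs_via_map[0])  # [('baseball', 'sunny'), ('basketball', 'windy')]
print(pairs_via_map == pairs_via_unpacking)  # True
```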
Upvotes: 1
Reputation: 42906
Use DataFrame.apply over axis=1 with zip:
df['pairs'] = df.apply(lambda x: list(zip(x['sport'], x['weather'])), axis=1)
fruit sport weather pairs
0 apple [baseball, basketball] [sunny, windy] [(baseball, sunny), (basketball, windy)]
1 banana [swimming, hockey] [cloudy, windy] [(swimming, cloudy), (hockey, windy)]
2 orange [football] [sunny] [(football, sunny)]
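For anyone who wants to reproduce this, a self-contained sketch follows. The DataFrame construction is an assumption rebuilt from the table in the question; only the apply line comes from the answer itself.

```python
import pandas as pd

# Rebuild the example frame from the question (construction is assumed,
# the question only shows the printed table).
df = pd.DataFrame({
    "fruit": ["apple", "banana", "orange"],
    "sport": [["baseball", "basketball"], ["swimming", "hockey"], ["football"]],
    "weather": [["sunny", "windy"], ["cloudy", "windy"], ["sunny"]],
})

# Row-wise apply: each row is passed as a Series, and the two list-valued
# cells are zipped element by element into a list of tuples.
df["pairs"] = df.apply(lambda row: list(zip(row["sport"], row["weather"])), axis=1)

print(df["pairs"].tolist())
```

Row-wise apply is usually the slowest of the options in this thread on large frames, so the list-comprehension versions may be preferable when speed matters.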
Upvotes: 1
Reputation: 150735
I think you are missing another list(zip()):
df['pairs'] = [list(zip(a, b)) for a, b in zip(df['sport'], df['weather'])]
Output:
fruit sport weather pairs
0 apple ['baseball', 'basketball'] ['sunny', 'windy'] [('baseball', 'sunny'), ('basketball', 'windy')]
1 banana ['swimming', 'hockey'] ['cloudy', 'windy'] [('swimming', 'cloudy'), ('hockey', 'windy')]
2 orange ['football'] ['sunny'] [('football', 'sunny')]
Upvotes: 2