Reputation: 23
I have pandas data frame like this
Name Col1
Foo [A, C, B, D, E]
Bar [G, M, O]
Baz [E, B]
I want to change it to:
Name New_Col1 New_Col2
Foo A C
Foo C B
Foo B D
Foo D E
Bar G M
Bar M O
Baz E B
How can I do that?
Upvotes: 1
Views: 115
Reputation: 1746
Original data:
df = pd.DataFrame({'name': ['Foo', 'Bar', 'Baz'],
'col1': [['A', 'C', 'B', 'D', 'E'], ['G', 'M', 'O'], ['E', 'B']]})
This step extract each values of col1
values
df = df.explode('col1')
Creating a new column by leading col1
df['New_Col2'] = df.groupby('name')['col1'].shift(-1)
Drop nam from last rows within each group and rename col1
df = df.dropna()
df.rename(columns={'col1': 'New_Col1'})
Final output
print(df)
This is similar to mozway's answer. But maybe bit more readable.
Upvotes: 0
Reputation: 261860
Using explode
and GroupBy.shift
:
s = df['Col1'].explode()
df.drop(columns='Col1').join(
pd.concat([s, s.groupby(level=0).shift(-1)], axis=1)
.set_axis(['New_Col1', 'New_Col2'], axis=1)
.dropna()
)
output:
Name New_Col1 New_Col2
0 Foo A C
0 Foo C B
0 Foo B D
0 Foo D E
1 Bar G M
1 Bar M O
2 Baz E B
With the approach, you can easily generalize to more combinations, example with 3:
s = df['Col1'].explode()
n = 3
df.drop(columns='Col1').join(
pd.concat([s.groupby(level=0).shift(-x) for x in range(n)], axis=1)
.set_axis([f'New_Col{i+1}' for i in range(n)], axis=1)
.dropna()
)
output:
Name New_Col1 New_Col2 New_Col3
0 Foo A C B
0 Foo C B D
0 Foo B D E
1 Bar G M O
2 Baz NaN NaN NaN
Upvotes: 0