Reshape column to multiple columns

Question

I'm rather new to Python and i'm having some troubles. I have the following dataframe:

import pandas as pd
data = {'v1':('Belgium[country]', 'Antwerp[city]', 'Gent[city]', 'France[country]', 'Paris[city]', 'Marseille[city]', 'Toulouse[city]', 'Spain[country]', 'Madrid[city]')}
df = pd.DataFrame(data)
df

   v1
0  Belgium[country]
1  Antwerp[city]
2  Gent[city]
3  France[country]
4  Paris[city]
5  Marseille[city]
6  Toulouse[city]
7  Spain[country]
8  Madrid[city]

Which I want to map to the following format:

   v1                v2
0  Belgium[country]  Antwerp[city]
1  Belgium[country]  Gent[city]
2  France[country]   Paris[city]
3  France[country]   Marseille[city]
4  France[country]   Toulouse[city]
5  Spain[country]    Madrid[city]

I found a way to do this using a dictionary, but since I want to maintain the order I'm looking for a way to do this using a list or so.

I tried it both based on the indexes and on the values themselves (specifically [country] and [city]), but i failed with both. Any help is much appreciated!

IanS · Accepted Answer

This will work:

counter = df['v1'].str.contains('country').cumsum()
result = df.groupby(counter).apply(lambda g: g[1:]).reset_index(level=1, drop=True)
result = result.rename(columns={'v1': 'v2'}).reset_index(drop=False)
result['v1'] = result['v1'].replace(df.groupby(counter).first().squeeze())

The idea is to add a counter that is incremented for each new country. You can then group by this counter to access the information you need.

Specifically, the first step is to keep only cities (g[1:] for each group g). Then do some renaming and reindexing. Finally, use the result from another groupby (giving the country) to replace the values in column v1.

Reshape column to multiple columns

Answers (2)

Related Questions