Python3 dataframe restructure

Question

I have a dataframe with 2 columns. The first column, 'Feature', contains 11 unique values that repeat in the same order down the frame (about 148,000 rows in total); the second column, 'Value', holds the corresponding value for each row.

    Feature         Value
0   Way ID          2781002
1   Highway         motorway_link
2   Toll            yes
3   Reference       n/a
4   Bridge          yes
5   County          n/a
6   Name            n/a
7   Name2           n/a
8   Name3           n/a
9   Name4           n/a
10  Name Type       n/a
11  Way ID          2788620
12  Highway         motorway
13  Toll            yes
14  Reference       A 49
15  Bridge          n/a
16  County          n/a
17  Name            n/a
18  Name2           n/a
19  Name3           n/a
20  Name4           n/a
21  Name Type       n/a
22  Way ID          2954156
... ... ...
148026  Name Type   n/a
148027  Way ID      545273699
148028  Highway     motorway
148029  Toll        yes
148030  Reference   A 4
148031  Bridge      n/a
148032  County      n/a
148033  Name        Autoroute de l'Est
148034  Name2       n/a
148035  Name3       n/a
148036  Name4       n/a
148037  Name Type   n/a

I'd like to have the dataframe set up horizontally in the following way:

    Feature         Value_1           Value_2        Value_3
0   Way ID          2781002           2788620        2954156
1   Highway         motorway_link     motorway       motorway
2   Toll            yes               yes            yes       
3   Reference       n/a               n/a            n/a
4   Bridge          yes               yes            no
... ... ...    
10  Name Type       n/a               n/a            n/a

How can I do this? I tried looping over the dataframe, creating a new df/list for each set of 11 rows, and then concatenating them. The problem is that I can't create a new df with a different name on each iteration using '{}'.format(i) syntax inside the loop; it doesn't work.

Scott Boston · Accepted Answer

Try this:

df.groupby('Feature')['Value'].apply(lambda x: pd.Series(x.tolist())).unstack().add_prefix('Value_')

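This uses groupby, apply and unstack: within each Feature group the values are renumbered from 0, unstack spreads those positions across columns, and add_prefix names them Value_0, Value_1, and so on. Note that Feature ends up as the index, sorted alphabetically; call reset_index() if you want it back as a column.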
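
If you want the columns numbered from 1 and the features kept in their original order, as in the desired output in the question, one possible variation (not part of the one-liner above) is to number the occurrences with cumcount and then pivot. This is only a sketch: the small df below is a 3-feature stand-in for the real 11-feature data, and the names block, order and wide are made up for illustration.

```python
import pandas as pd

# Stand-in for the question's data: 3 features instead of 11 (assumption).
df = pd.DataFrame({
    "Feature": ["Way ID", "Highway", "Toll"] * 2,
    "Value": ["2781002", "motorway_link", "yes",
              "2788620", "motorway", "yes"],
})

# Number each occurrence of a feature: 1 for the first block, 2 for the second, ...
block = df.groupby("Feature").cumcount() + 1

# Remember the order in which the features first appear.
order = pd.Index(df["Feature"].unique(), name="Feature")

wide = (
    df.assign(block=block)
      .pivot(index="Feature", columns="block", values="Value")
      .add_prefix("Value_")   # columns 1, 2 become Value_1, Value_2
      .reindex(order)         # undo the alphabetical sort of the index
      .reset_index()          # turn Feature back into a regular column
)
wide.columns.name = None      # drop the leftover "block" label on the columns

print(wide)
```

This relies on every feature appearing exactly once per block, which matches the data shown in the question; if a feature is missing from some block, the numbering for that feature will drift out of step with the others.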
