Reputation: 13
I have a dataframe that consists of 2 columns; first columns 'Features', contains 11 unique values stacked vertically 148037 times, with the 'Values' column having different corresponding values.
Feature Value
0 Way ID 2781002
1 Highway motorway_link
2 Toll yes
3 Reference n/a
4 Bridge yes
5 County n/a
6 Name n/a
7 Name2 n/a
8 Name3 n/a
9 Name4 n/a
10 Name Type n/a
11 Way ID 2788620
12 Highway motorway
13 Toll yes
14 Reference A 49
15 Bridge n/a
16 County n/a
17 Name n/a
18 Name2 n/a
19 Name3 n/a
20 Name4 n/a
21 Name Type n/a
22 Way ID 2954156
... ... ...
148026 Name Type n/a
148027 Way ID 545273699
148028 Highway motorway
148029 Toll yes
148030 Reference A 4
148031 Bridge n/a
148032 County n/a
148033 Name Autoroute de l'Est
148034 Name2 n/a
148035 Name3 n/a
148036 Name4 n/a
148037 Name Type n/a
I'd like to have the dataframe set up horizontally in the following way:
Feature Value_1 Value_2 Value_3
0 Way ID 2781002 2788620 2954156
1 Highway motorway_link motorway motorway
2 Toll yes yes yes
3 Reference n/a n/a n/a
4 Bridge yes yes no
... ... ...
10 Name Type n/a n/a n/a
How can I do this? I tried to use a loop and create a new df/list for every new set of 11 rows, then concatenate them together. Problem is, I can't create new df with different names using '{}'.format(i) syntax within a loop, it doesn't like it for some reason.
Upvotes: 1
Views: 60
Reputation: 323366
By using pivot_table
pd.pivot_table(df,index=df.Feature,columns=df.groupby('Feature').cumcount().add(1),values='Value',aggfunc='sum').add_prefix('Value_')
Out[779]:
Value_1 Value_2 Value_3
Feature
Bridge yes n/a None
County n/a n/a None
Highway motorway_link motorway None
Name n/a n/a None
Name2 n/a n/a None
Name3 n/a n/a None
Name4 n/a n/a None
NameType n/a n/a None
Reference n/a A49 None
Toll yes yes None
WayID 2781002 2788620 2954156
Upvotes: 1
Reputation: 153510
Try this:
df.groupby('Feature')['Value'].apply(lambda x: pd.Series(x.tolist())).unstack().add_prefix('Value_')
Using groupby
, apply
and unstack
.
Upvotes: 1