Reputation: 1200
I have a stupid problem that I can't seem to solve. I need to take a pandas dataframe like so:
id part1 part2 part3 part4 part5
23024 xyz9 23l lk8 jkd9 298
48392 xyz10 24x 29x ef3 298
Now, i just want to "pivot" the table so that there are two columns displaying the ID and the parts:
id Parts
23024 xyz9
23024 23l
23024 lk8
23024 jkd9
23024 298
48392 xyz10
48392 29x
48392 ef3
48392 298
I started by getting all the unique IDs:
lst =[]
for i in df['id']:
if i is not None:
lst.append(i)
Then tried a few functions to get the values, but failed:
df_new = pd.DataFrame(columns=['id', 'parts'])
for i in lst:
temp_df = df.loc[df['id'] == i]
for x in temp_df.columns:
temp_list = []
temp_list.append([str(i), str(temp_df[x][0])])
temp_df = pd.DataFrame(temp_list, columns=['id', 'parts'])
df = df.append(temp_df)
But I keep getting KeyErrors. What am I doing wrong?
Upvotes: 0
Views: 54
Reputation: 323326
So here are two more options melt
and wide_to_long
: personally I recommend second one since we do not loss any information after reshape , we still have the part number
df.melt('id')
Out[167]:
id variable value
0 23024 part1 xyz9
1 48392 part1 xyz10
2 23024 part2 23l
3 48392 part2 24x
4 23024 part3 lk8
5 48392 part3 29x
6 23024 part4 jkd9
7 48392 part4 ef3
8 23024 part5 298
9 48392 part5 298
pd.wide_to_long(df,['part'],i='id',j='number')
Out[168]:
part
id number
23024 1 xyz9
48392 1 xyz10
23024 2 23l
48392 2 24x
23024 3 lk8
48392 3 29x
23024 4 jkd9
48392 4 ef3
23024 5 298
48392 5 298
Upvotes: 3
Reputation: 10590
An alternative method is to set 'id'
as the index and stack the remaining columns:
df_new = df.set_index('id').stack().reset_index()
If needed, you can select the columns you want to keep if that's important:
df_new[['id', 0]].rename(columns={0: 'Parts'})
Output:
id Parts
0 23024 xyz9
1 23024 23l
2 23024 lk8
3 23024 jkd9
4 23024 298
5 48392 xyz10
6 48392 24x
7 48392 29x
8 48392 ef3
9 48392 298
Upvotes: 1