Ashley O
Ashley O

Reputation: 1200

Similar to a Pivot in Pandas

I have a stupid problem that I can't seem to solve. I need to take a pandas dataframe like so:

id     part1    part2   part3   part4   part5
23024   xyz9     23l    lk8     jkd9    298
48392   xyz10    24x    29x      ef3    298

Now, i just want to "pivot" the table so that there are two columns displaying the ID and the parts:

id    Parts
23024  xyz9
23024  23l
23024  lk8
23024  jkd9
23024  298
48392  xyz10
48392  29x
48392  ef3
48392  298

I started by getting all the unique IDs:

lst =[]
for i in df['id']:
   if i is not None:
      lst.append(i)

Then tried a few functions to get the values, but failed:

    df_new = pd.DataFrame(columns=['id', 'parts'])
    for i in lst:
        temp_df = df.loc[df['id'] == i]
        for x in temp_df.columns:
            temp_list = []
            temp_list.append([str(i), str(temp_df[x][0])])
            temp_df = pd.DataFrame(temp_list, columns=['id', 'parts'])
            df = df.append(temp_df)

But I keep getting KeyErrors. What am I doing wrong?

Upvotes: 0

Views: 54

Answers (2)

BENY
BENY

Reputation: 323326

So here are two more options melt and wide_to_long : personally I recommend second one since we do not loss any information after reshape , we still have the part number

df.melt('id')
Out[167]: 
      id variable  value
0  23024    part1   xyz9
1  48392    part1  xyz10
2  23024    part2    23l
3  48392    part2    24x
4  23024    part3    lk8
5  48392    part3    29x
6  23024    part4   jkd9
7  48392    part4    ef3
8  23024    part5    298
9  48392    part5    298
pd.wide_to_long(df,['part'],i='id',j='number')
Out[168]: 
               part
id    number       
23024 1        xyz9
48392 1       xyz10
23024 2         23l
48392 2         24x
23024 3         lk8
48392 3         29x
23024 4        jkd9
48392 4         ef3
23024 5         298
48392 5         298

Upvotes: 3

busybear
busybear

Reputation: 10590

An alternative method is to set 'id' as the index and stack the remaining columns:

df_new = df.set_index('id').stack().reset_index()

If needed, you can select the columns you want to keep if that's important:

df_new[['id', 0]].rename(columns={0: 'Parts'})

Output:

      id  Parts
0  23024   xyz9
1  23024    23l
2  23024    lk8
3  23024   jkd9
4  23024    298
5  48392  xyz10
6  48392    24x
7  48392    29x
8  48392    ef3
9  48392    298

Upvotes: 1

Related Questions