Silvana Albu

Reputation: 21

Pandas: going from long to wide format in a dataframe

I am having trouble going from long format to wide format in pandas. There are plenty of examples going from wide to long, but I did not find one going from long to wide. I am trying to reshape my dataframe, and pivot, groupby and unstack are a bit confusing for my use case.

This is how I want it to be. The numbers are actually the Intensity column from the second image.

[image: desired wide format]

And this is how it is now

[image: current long format]

I tried to build a MultiIndex based on Peptide, Charge and Protein. Then I tried to pivot based on that multi index, and keep all the samples and their intensity as values:

df.set_index(['Peptide', 'Charge', 'Protein'], append=False)
df.pivot(index=df.index, columns='Sample', values='Intensity')

Of course, this does not work since my index is now a combination of the 3 and not an actual column in the dataframe.

It tells me

KeyError: None of [RangeIndex(start=0, stop=3397898, step=1)] are in the [columns]

I also tried groupby, but I am not sure how to move from the long format back to wide. I am quite new to the dataframe way of thinking and I want to learn how to do this right. It was very tempting for me to take an old-school "java"-like approach with 4 for loops and build it as a matrix. Thank you in advance!

Upvotes: 2

Views: 5402

Answers (1)

Rick M

Reputation: 1012

I think based on your attempt that this might work:

df2 = df.pivot(index=['Peptide', 'Charge', 'Protein'], columns='Sample', values='Intensity').reset_index()

After that, if you want to remove the name from the column axis:

df2 = df2.rename_axis(None, axis=1)
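
To see the two steps together, here is a minimal sketch on a tiny made-up frame. The column names come from the question; the values, and the names wide and wide_pt, are only for illustration. Passing a list as index needs a reasonably recent pandas (1.1 or later, as far as I know). Note that pivot raises a ValueError if the same Peptide/Charge/Protein/Sample combination appears more than once; in that case pivot_table with an aggregation (mean is an assumption here, pick whatever fits your data) is one way around it.

```python
import pandas as pd

# Made-up long-format data; only the column names are taken from the question.
df = pd.DataFrame({
    "Peptide": ["AAK", "AAK", "CCR", "CCR"],
    "Charge": [2, 2, 3, 3],
    "Protein": ["P1", "P1", "P2", "P2"],
    "Sample": ["S1", "S2", "S1", "S2"],
    "Intensity": [10.0, 20.0, 30.0, 40.0],
})

# One row per (Peptide, Charge, Protein), one column per Sample.
wide = (
    df.pivot(index=["Peptide", "Charge", "Protein"],
             columns="Sample", values="Intensity")
      .reset_index()               # turn the three index levels back into columns
      .rename_axis(None, axis=1)   # drop the "Sample" label from the column axis
)

# Alternative that tolerates duplicate combinations by aggregating them.
wide_pt = (
    df.pivot_table(index=["Peptide", "Charge", "Protein"],
                   columns="Sample", values="Intensity", aggfunc="mean")
      .reset_index()
      .rename_axis(None, axis=1)
)

print(wide)
```

Samples that are missing for a given peptide end up as NaN in the wide frame.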

Upvotes: 4
