MBI

Reputation: 31

Manipulate Dataframe

Let's say I'm working on this dummy dataset:

import pandas as pd
data = pd.DataFrame({"Name_id": ["John", "Deep", "Julia", "John", "Sandy", "Deep"],
                     "Month_id": ["December", "March", "May", "April", "May", "July"],
                     "Colour_id": ["Red", "Purple", "Green", "Black", "Yellow", "Orange"]})
data


How can I convert this data frame into something like this:

[image of the desired output table]

Here Name_id should be unique, and the other columns should be spread into new columns in order of appearance, with empty cells where a name has no further rows. I have tried to use pivot, but it seems to be meant for numerical data rather than categorical data.

Upvotes: 1

Views: 44

Answers (1)

ThomasIsCoding

Reputation: 102349

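You can still use pivot here. It works on categorical values as long as each index/column pair is unique, so first number each name's rows with cumcount: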

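# Number each Name_id's rows 1, 2, ... in order of appearance
data['Rowid'] = data.groupby('Name_id').cumcount() + 1
# Pivot both value columns; the result has MultiIndex columns (value, Rowid)
d = data.pivot(index='Name_id', columns='Rowid', values=['Month_id', 'Colour_id'])
# Flatten ('Month_id', 1) into 'Month_id1', and so on
d.columns = [f'{col}{i}' for col, i in d.columns]
# Interleave the columns so each month sits next to its colour
d = d[['Month_id1', 'Colour_id1', 'Month_id2', 'Colour_id2']].reset_index()

which gives

  Name_id Month_id1 Colour_id1 Month_id2 Colour_id2
0    Deep     March     Purple      July     Orange
1    John  December        Red     April      Black
2   Julia       May      Green       NaN        NaN
3   Sandy       May     Yellow       NaN        NaN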
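
If a name can appear more than twice, hard-coding the column names stops working. Below is a minimal sketch of the same reshape using set_index and unstack instead of pivot, with the column names built from the data. It assumes the desired layout puts each Month_id next to its Colour_id for every occurrence; the helper name `widen` and its `key` parameter are hypothetical and not part of pandas.

```python
import pandas as pd

data = pd.DataFrame({
    "Name_id": ["John", "Deep", "Julia", "John", "Sandy", "Deep"],
    "Month_id": ["December", "March", "May", "April", "May", "July"],
    "Colour_id": ["Red", "Purple", "Green", "Black", "Yellow", "Orange"],
})


def widen(df, key="Name_id"):
    """Hypothetical helper: one row per key, one column group per occurrence."""
    value_cols = [c for c in df.columns if c != key]
    # Occurrence number of each row within its key group: 1, 2, ...
    occurrence = df.groupby(key).cumcount() + 1
    # Move the occurrence number from the row index into the columns
    wide = df.set_index([key, occurrence])[value_cols].unstack()
    # Interleaved order: Month_id1, Colour_id1, Month_id2, Colour_id2, ...
    n = int(occurrence.max())
    ordered = [(col, i) for i in range(1, n + 1) for col in value_cols]
    wide = wide[ordered]
    # Flatten the MultiIndex columns into plain strings
    wide.columns = [f"{col}{i}" for col, i in wide.columns]
    return wide.reset_index()


result = widen(data)
print(result)
```

Names with fewer occurrences than the maximum get NaN in the trailing columns, the same as with pivot.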

Upvotes: 2
