Daniel
Daniel

Reputation: 49

Transform table following ID using pandas

Purchase_ID Item Date
1 A 21/3/2020
1 B 21/3/2020
1 C 21/3/2020
2 X 27/3/2020
2 Y 27/3/2020

I would like to transform the table into the output below using pandas/any python modules

Purchase_ID Item Date
1 A,B,C 21/3/2020
2 X,Y 27/3/2020

Upvotes: 1

Views: 42

Answers (2)

BLimitless
BLimitless

Reputation: 2605

The following works:

df2 = df.pivot_table(index='Purchase_ID', values=['Item','Date'], aggfunc=set)

Note: you'll get a set, rather than list, for the Items and Dates. This avoids duplicating the dates.

If you want to get rid of the sets, you can convert to strings:

df2['Date'] = df2['Date'].apply(lambda x: list(x)[0])
df2['Item'] = df2['Item'].apply(lambda x: ','.join(list(x))

Then df2 is the DataFrame you're looking for.

out:
               Date       Item
Purchase_ID                      
1            21/3/2020    A,B,C
2            27/3/2020    X,Y

Upvotes: 1

Anurag Dabas
Anurag Dabas

Reputation: 24324

try via groupby() and agg():

out=df.groupby('Purchase_ID',as_index=False).agg({'Item':','.join,'Date':'first'})

If needed unique elements then use:

out=df.groupby('Purchase_ID',as_index=False).agg({'Item':lambda x:','.join(set(x)),'Date':'first'})

output of out:

    Purchase_ID     Item    Date
0   1               A,C,B   21/3/2020
1   2               Y,X     27/3/2020

Upvotes: 1

Related Questions