lwu29

Reputation: 111

Transform a DataFrame in Python

import pandas as pd

df = pd.DataFrame({'col1': [1,1,1,1,1,1, 2,2,2,2,2,2], 'col2': ['in', 'out','in', 'out','in', 'out','in', 'out','in', 'out','in', 'out'], 'col3':['A','B','C','D','E','F','G','H','I','J','K','L']})

I'm looking for an efficient way to transform the above dataframe into a table like this, with the col3 values collected for each (col1, col2) pair:

| col1 | col2 | col3        |
| ---- | ---- | ----------- |
| 1    | in   | 'A','C','E' |
| 1    | out  | 'B','D','F' |
| 2    | in   | 'G','I','K' |
| 2    | out  | 'H','J','L' |

Upvotes: 0

Views: 68

Answers (2)

Pawan Jain

Reputation: 825

You can group by col1 and col2, then use apply(list) to collect the col3 values of each group into a list:

df.groupby(['col1','col2'])['col3'].apply(list).reset_index()

Output

   col1 col2       col3
0     1   in  [A, C, E]
1     1  out  [B, D, F]
2     2   in  [G, I, K]
3     2  out  [H, J, L]
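
For reference, here is a minimal self-contained sketch of the same idea. It rebuilds the question's dataframe in a shorter form and uses agg(list) instead of apply(list); for this data the two should give the same result. The variable name `lists` is just an illustrative choice.

```python
import pandas as pd

# Same data as in the question, written more compactly.
df = pd.DataFrame({
    'col1': [1] * 6 + [2] * 6,
    'col2': ['in', 'out'] * 6,
    'col3': list('ABCDEFGHIJKL'),
})

# Collect the col3 values of each (col1, col2) group into a Python list.
lists = df.groupby(['col1', 'col2'])['col3'].agg(list).reset_index()
print(lists)
```

Note that each col3 cell then holds a real Python list, which is handy for further processing but is not a plain string.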

Upvotes: 1

Nk03

Reputation: 14949

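You can use groupby with agg and ','.join to combine the col3 values of each group into a single comma-separated string: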

result = df.groupby(['col1', 'col2'], as_index=False).agg({'col3': ','.join})

OUTPUT:

   col1 col2   col3
0     1   in  A,C,E
1     1  out  B,D,F
2     2   in  G,I,K
3     2  out  H,J,L
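
If the quotes shown in the question's expected table matter, one possible variation (a sketch, assuming the quoted form is wanted as a literal string) is to wrap each value in quotes before joining. The name `quoted` and the lambda are illustrative and not part of the original answer.

```python
import pandas as pd

# Same data as in the question, written more compactly.
df = pd.DataFrame({
    'col1': [1] * 6 + [2] * 6,
    'col2': ['in', 'out'] * 6,
    'col3': list('ABCDEFGHIJKL'),
})

# Wrap every col3 value in single quotes, then join the values per group.
quoted = (
    df.groupby(['col1', 'col2'], as_index=False)
      .agg({'col3': lambda s: ','.join(f"'{v}'" for v in s)})
)
print(quoted)
```

Here each col3 cell is a single string such as 'A','C','E' with the quotes included, so it matches the layout in the question but is less convenient than a list for further computation.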

Upvotes: 4
