How to create a crosstab with concatenation on pandas dataframe?

I have a pandas dataframe called log:

  order  row   column     
  1      3     B   
  2      6     U        
  3      3     U       
  4      7     C
  5      6     B

I want to create a dataframe with one row per distinct value of the row column, where the sequence value is built by concatenating that group's values from the column column, taken in the order given by the order column:

        sequence
  3     BU
  6     UB
  7     C

Is there a (fast) way to do that?

Upvotes: 1

Views: 160

Answers (2)

Erfan

Reputation: 42916

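First sort_values by order, then groupby on row. Passing sort=False keeps the groups in order of first appearance instead of sorting them by key; the rows inside each group keep their sorted order either way. Finally, use GroupBy.agg to join the strings: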

dfg = (
    df.sort_values("order")
    .groupby("row", sort=False)["column"].agg("".join)
    .reset_index(name="sequence")
)

Output

   row sequence
0    3       BU
1    6       UB
2    7        C
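
For a self-contained check, here is a minimal sketch that rebuilds the sample data from the question and applies the same sort / groupby / join chain. The variable name log follows the question (the snippet above calls it df), and sequences is just an illustrative name.

```python
import pandas as pd

# Sample data copied from the question.
log = pd.DataFrame(
    {
        "order": [1, 2, 3, 4, 5],
        "row": [3, 6, 3, 7, 6],
        "column": ["B", "U", "U", "C", "B"],
    }
)

sequences = (
    log.sort_values("order")               # concatenation should follow `order`
    .groupby("row", sort=False)["column"]  # groups in first-appearance order
    .agg("".join)                          # join the strings within each group
    .reset_index(name="sequence")          # turn the Series back into a DataFrame
)

print(sequences)
```

If row should be the index, as in the desired output in the question, dropping reset_index and calling .to_frame("sequence") on the result instead should give that shape.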

Upvotes: 2

Michael Szczesny

Reputation: 5036

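This does the job, provided the dataframe is already sorted by order (groupby keeps the rows of each group in their existing order):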

df.groupby('row')['column'].apply(''.join)

Output

3    BU
6    UB
7     C
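
Because this version never looks at order, the result depends on how the rows happen to be arranged. A small sketch of that effect, using a deliberately reversed copy of the question's data (reversed_log, unsorted_result and sorted_result are illustrative names, not part of either answer):

```python
import pandas as pd

# Sample data copied from the question.
log = pd.DataFrame(
    {
        "order": [1, 2, 3, 4, 5],
        "row": [3, 6, 3, 7, 6],
        "column": ["B", "U", "U", "C", "B"],
    }
)

# Reverse the rows so they are no longer sorted by `order`.
reversed_log = log.iloc[::-1]

# Without sorting, strings are joined in the current row order.
unsorted_result = reversed_log.groupby("row")["column"].agg("".join)

# Sorting by `order` first restores the intended sequences.
sorted_result = (
    reversed_log.sort_values("order").groupby("row")["column"].agg("".join)
)

print(unsorted_result.to_dict())  # {3: 'UB', 6: 'BU', 7: 'C'}
print(sorted_result.to_dict())    # {3: 'BU', 6: 'UB', 7: 'C'}
```

So if the log is not guaranteed to arrive sorted, adding sort_values("order") before the groupby is the safer choice.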

Upvotes: 2
