Mathurin Elsen
Mathurin Elsen

Reputation: 11

Concat rows in dataframe

I have a dataframe looking like this :

DATA | 0 | 1 | 2 | 3 
XX1  |   | 0A | 0B | 
XX2  |   | 0C |    |
XX3  |   | 1A | 0A | 1C

I would like to transform it that way :

id | DATA 
0A  | XX1 
0A  | XX3
0B  | XX1
0C  | XX2 
1A  | XX3
1C  | XX3

Every id must appear each time it's appearing in the dataframe, that's the reason why there is 2 "0A"

I tried this so far, but I'm getting nowhere.

la = []    
for b in range(2,10):
        la.append(df.groupby(by=[b])[b,"DATA"].agg(np.sum).rename(columns={b:"id"}))
    
df = pd.concat(la, ignore_index=True)

Anyone got a suggestion?

Upvotes: 0

Views: 43

Answers (1)

Erfan
Erfan

Reputation: 42946

You can use DataFrame.melt for this:

(
    df.melt(id_vars="DATA", value_name="id")
    .dropna()
    .sort_values("id", ignore_index=True)
    .drop(columns=["variable"])
)
  DATA  id
0  XX1  0A
1  XX3  0A
2  XX1  0B
3  XX2  0C
4  XX3  1A
5  XX3  1C

Upvotes: 1

Related Questions