Pandas - Merge columns into one keeping the column name

Question

I have a dataframe with four columns: ID, Phone1, Phone2, and Phone3. I would like to create a new dataframe with three columns: ID, Phone, PhoneSource. If I do an append as in this question:

df['Column 1'].append(df['Column 2']).reset_index(drop=True)

I obtain half of what I want: all the Phone numbers are in the same column. But how do I keep the source?

jezrael · Accepted Answer

You can use melt, which unpivots the phone columns into rows and stores each original column name in a new column:

import pandas as pd

# Sample data: one ID column plus three phone columns
df = pd.DataFrame({'ID': [2, 3, 4, 5],
                   'Phone 1': ['A', 'B', 'C', 'D'],
                   'Phone 2': ['E', 'F', 'G', 'H'],
                   'Phone 3': ['A', 'C', 'G', 'H']})
print(df)
   ID Phone 1 Phone 2 Phone 3
0   2       A       E       A
1   3       B       F       C
2   4       C       G       G
3   5       D       H       H

print(pd.melt(df, id_vars='ID', var_name='PhoneSource', value_name='Phone'))
    ID PhoneSource Phone
0    2     Phone 1     A
1    3     Phone 1     B
2    4     Phone 1     C
3    5     Phone 1     D
4    2     Phone 2     E
5    3     Phone 2     F
6    4     Phone 2     G
7    5     Phone 2     H
8    2     Phone 3     A
9    3     Phone 3     C
10   4     Phone 3     G
11   5     Phone 3     H
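
The melt output above is ordered by source column. If you would rather have each ID's phones next to each other, one option is to sort the melted result. This is only a sketch on the same sample data; the name long_df is illustrative and not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'ID': [2, 3, 4, 5],
                   'Phone 1': ['A', 'B', 'C', 'D'],
                   'Phone 2': ['E', 'F', 'G', 'H'],
                   'Phone 3': ['A', 'C', 'G', 'H']})

# Unpivot the phone columns, then order rows by ID and by source column
long_df = (pd.melt(df, id_vars='ID', var_name='PhoneSource', value_name='Phone')
             .sort_values(['ID', 'PhoneSource'])
             .reset_index(drop=True))
print(long_df)
```

Sorting on both ID and PhoneSource keeps the result deterministic, and reset_index(drop=True) renumbers the rows after the sort.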

Another solution uses stack, which returns the rows grouped by ID rather than by source column:

df1 = df.set_index('ID').stack().reset_index()
df1.columns = ['ID','PhoneSource','Phone']
print(df1)
    ID PhoneSource Phone
0    2     Phone 1     A
1    2     Phone 2     E
2    2     Phone 3     A
3    3     Phone 1     B
4    3     Phone 2     F
5    3     Phone 3     C
6    4     Phone 1     C
7    4     Phone 2     G
8    4     Phone 3     G
9    5     Phone 1     D
10   5     Phone 2     H
11   5     Phone 3     H
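
One thing to watch if some phone numbers are missing: melt keeps those rows, while stack may or may not drop them depending on the pandas version. Dropping them explicitly gives the same result either way. A minimal sketch, assuming a hypothetical frame df_missing with one empty phone cell (not from the original question):

```python
import pandas as pd

# Hypothetical data: ID 2 has no second phone number
df_missing = pd.DataFrame({'ID': [2, 3],
                           'Phone 1': ['A', 'B'],
                           'Phone 2': [None, 'F']})

# melt keeps the row with the missing phone
long_missing = pd.melt(df_missing, id_vars='ID',
                       var_name='PhoneSource', value_name='Phone')

# Remove rows without a phone number and renumber the index
cleaned = long_missing.dropna(subset=['Phone']).reset_index(drop=True)
print(cleaned)
```

Whether to drop or keep the empty rows depends on what you need downstream; keeping them preserves the fact that a given source column was empty for that ID.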
