Reputation: 93
I have a dataframe with four columns: ID, Phone1, Phone2, and Phone3. I would like to create a new dataframe with three columns: ID, Phone, PhoneSource. If I do an append as in this question:
df['Column 1'].append(df['Column 2']).reset_index(drop=True)
I obtain half of what I want: all the Phone numbers are in the same column. But how do I keep the source?
Upvotes: 2
Views: 2248
Reputation: 863751
I think you can use melt:
import pandas as pd

df = pd.DataFrame({'ID':[2,3,4,5],
                   'Phone 1':['A', 'B', 'C', 'D'],
                   'Phone 2':['E', 'F', 'G', 'H'],
                   'Phone 3':['A', 'C', 'G', 'H']})
print (df)
   ID Phone 1 Phone 2 Phone 3
0   2       A       E       A
1   3       B       F       C
2   4       C       G       G
3   5       D       H       H
print (pd.melt(df, id_vars='ID', var_name='PhoneSource', value_name='Phone'))
    ID PhoneSource Phone
0    2     Phone 1     A
1    3     Phone 1     B
2    4     Phone 1     C
3    5     Phone 1     D
4    2     Phone 2     E
5    3     Phone 2     F
6    4     Phone 2     G
7    5     Phone 2     H
8    2     Phone 3     A
9    3     Phone 3     C
10   4     Phone 3     G
11   5     Phone 3     H
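If the real dataframe has other columns besides the phones, or if you want the rows grouped by ID and the columns in the order asked for (ID, Phone, PhoneSource), something like the following may help. This is only a sketch: the name phone_cols and the choice to sort by ID and PhoneSource are my assumptions, not part of the question.

```python
import pandas as pd

df = pd.DataFrame({'ID': [2, 3, 4, 5],
                   'Phone 1': ['A', 'B', 'C', 'D'],
                   'Phone 2': ['E', 'F', 'G', 'H'],
                   'Phone 3': ['A', 'C', 'G', 'H']})

# Assumption: only these columns hold phone numbers; anything else is left out.
phone_cols = ['Phone 1', 'Phone 2', 'Phone 3']

long_df = (pd.melt(df, id_vars='ID', value_vars=phone_cols,
                   var_name='PhoneSource', value_name='Phone')
             .sort_values(['ID', 'PhoneSource'])   # group rows by ID
             .reset_index(drop=True))

# Reorder the columns to match the layout requested in the question.
long_df = long_df[['ID', 'Phone', 'PhoneSource']]
print(long_df)
```

Passing value_vars explicitly means that a column added later (an address, say) is not melted by accident.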
Another solution with stack:
df1 = df.set_index('ID').stack().reset_index()
df1.columns = ['ID','PhoneSource','Phone']
print (df1)
    ID PhoneSource Phone
0    2     Phone 1     A
1    2     Phone 2     E
2    2     Phone 3     A
3    3     Phone 1     B
4    3     Phone 2     F
5    3     Phone 3     C
6    4     Phone 1     C
7    4     Phone 2     G
8    4     Phone 3     G
9    5     Phone 1     D
10   5     Phone 2     H
11   5     Phone 3     H
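The sample data above has a value in every phone cell. If some phones are missing in the real data, melt keeps those rows with NaN in the Phone column, so you may want to drop them afterwards. A minimal sketch, assuming hypothetical data with gaps (df_missing and its values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: ID 3 has no second phone, ID 4 has no third phone.
df_missing = pd.DataFrame({'ID': [2, 3, 4],
                           'Phone 1': ['A', 'B', 'C'],
                           'Phone 2': ['E', np.nan, 'G'],
                           'Phone 3': ['A', 'C', np.nan]})

# melt keeps one row per (ID, phone column) pair, including the empty ones.
melted = pd.melt(df_missing, id_vars='ID',
                 var_name='PhoneSource', value_name='Phone')

# Drop the rows that carry no phone number and renumber the index.
cleaned = melted.dropna(subset=['Phone']).reset_index(drop=True)
print(cleaned)
```

Here melted has 9 rows and cleaned has 7, since two of the nine cells were empty.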
Upvotes: 3