Arun Menon
Arun Menon

Reputation: 181

How to modify columns of a dataframe into values

I have two dataframe df1 and df2, where df1 has columns p1,p2 and p3. While dataframe df2 has values assigned for p1, p2 and p3. I would like to create a new dataframe (newdf1) where values in df1 becomes columns and p1,p2 and p3 are new values based on dataframe df2..

Query for df1 and df2 are

df1 = pd.DataFrame({"user": ["user1", "user2", "user3"], "p1": ["A", "C", "B"],"p2": ["C", "E", "A"],"p3": ["D", "A", "C"],}, index=[0, 1, 2], )
df2 = pd.DataFrame({"N1": ["p1", "p2", "p3"],"N2": ["2", "3", "4"], }, index=[0, 1, 2], )

My df1 looks like this enter image description here

And df2 look like this

enter image description here

While my desired newdf1 should look like this

enter image description here

(yellow highlight is to show transformation of p1 in the new dataframe)

Upvotes: 2

Views: 73

Answers (2)

Shubham Sharma
Shubham Sharma

Reputation: 71689

IIUC, you can use:

m = df1.melt('user')
m['variable'] = m['variable'].map(df2.set_index('N1')['N2'])
m.pivot('user', 'value', 'variable').fillna(0)

Details:

melt the dataframe on user

print(m)
    user variable value
0  user1       p1     A
1  user2       p1     C
2  user3       p1     B
3  user1       p2     C
4  user2       p2     E
5  user3       p2     A
6  user1       p3     D
7  user2       p3     A
8  user3       p3     B

map the variable column with corresponding values from df2:

print(m)

    user variable value
0  user1        2     A
1  user2        2     C
2  user3        2     B
3  user1        3     C
4  user2        3     E
5  user3        3     A
6  user1        4     D
7  user2        4     A
8  user3        4     C

pivot the above melted dataframe to reshape:

value  A  B  C  D  E
user                
user1  2  0  3  4  0
user2  4  0  2  0  3
user3  3  2  0  0  0

Upvotes: 3

ALollz
ALollz

Reputation: 59549

stack so that you can merge the second DataFrame to bring over the values and then pivot to your desired output.

(df1.set_index('user').stack().reset_index().rename(columns={'level_1': 'N1'})
    .merge(df2, how='left')
    .pivot(index='user', columns=0, values='N2')
    .rename_axis(columns=None)
    .fillna(0))

       A  B  C  D  E
user                
user1  2  0  3  4  0
user2  4  0  2  0  3
user3  3  2  4  0  0

Upvotes: 2

Related Questions