Geetha Anand
Geetha Anand

Reputation: 73

Getting error while adding column of a dataframe in pandas

I have two Dataframes, G1 and G2:

G1:

studentId midterm homework final
101       45       69       89  
102       20       22       11
103       11       11       11 

G2:

studentId midterm homework final
101       55       69       89  
102       20       65       11
103       11       11       11

Trying to get the output as below:

student           Old  New
101     Midterm    45   55
102     Homework   22   65   

Upvotes: 2

Views: 337

Answers (2)

anky
anky

Reputation: 75080

Here is an alternate way using merge and pd.wide_to_long:

m=df1.merge(df2,on='studentId',suffixes=('_old','_new'))
n=(pd.wide_to_long(m,df1.columns[1:],'studentId','type',sep='_',suffix='\w+')
                                                         .unstack().stack(0))
n.query("new!=old")

           type    new old
studentId           
101       midterm   55  45
102       homework  65  22

Upvotes: 2

jezrael
jezrael

Reputation: 862511

Create index by columns studentId for both DataFrames, also is possible add parameter keys to concat for new columns names:

G1 = G1.set_index('studentId')
G2 = G2.set_index('studentId')
GBool = (G1!= G2).stack() 
Gdiff = pd.concat([G1.stack()[GBool],G2.stack() [GBool]], axis=1, keys=('Old','New'))
print(Gdiff) 
                    Old  New
studentId                   
101       midterm    45   55
102       homework   22   65

If is necessary convert MultiIndex to columns:

Gdiff = (pd.concat([G1.stack()[GBool],
                   G2.stack()[GBool]], 
                   axis=1, 
                   keys=('Old','New')).rename_axis(('studentId','type')).reset_index())
print(Gdiff) 
   studentId      type  Old  New
0        101   midterm   45   55
1        102  homework   22   65

Upvotes: 2

Related Questions