JJJohn
JJJohn

Reputation: 1079

update values based on if the values of another column is in the column of another dataframe

Assume I'm teaching java to a class of students with a colleague teaching them math.

The class took tests a few days ago, here are the grades of java.

data = [['0110200026', 50], ['0110200057', 55], ['0110200079', 80]]
df = pd.DataFrame(data, columns = ['stu_no', 'grades'])
df

enter image description here

Sadly, two guys didn't get good grades in the java test, so they took a make-up test with another guy for math. Fortunately, all of them passed the make-up test.

data2 = [['0110200056', 65], ['0110200026', 75], ['0110200057', 65] ]
df2 = pd.DataFrame(data2, columns = ['stu_no', 'grades'])
df2

enter image description here

I'm trying to make a report that merges the new grade into the old grades, ignoring a guy ('0110200056') as he was for math.

So, here are the new records to be merged.

df2.loc[df2.stu_no.isin(df.stu_no)]

enter image description here

I tried 3 commands separately

df.loc[df.stu_no.isin(df2.stu_no)] = df2
df.update(df2)
df2.update(df)

none of them gives what I want

enter image description here

What do I do?

Upvotes: 1

Views: 50

Answers (1)

jezrael
jezrael

Reputation: 862581

In your solution match stu_no by convert values to index and then use update:

df = df.set_index('stu_no')
df2 = df2.set_index('stu_no')
df.update(df2)
df = df.reset_index()
print (df)
       stu_no  grades
0  0110200026    75.0
1  0110200057    65.0
2  0110200079    80.0

Or use Series.map for new values by stu_no from another DataFrame and missing values (if no match) replace by original column values:

df['grades'] = df['stu_no'].map(df2.set_index('stu_no')['grades']).fillna(df['grades'])
print (df)
       stu_no  grades
0  0110200026    75.0
1  0110200057    65.0
2  0110200079    80.0

Upvotes: 1

Related Questions