Fill new column in one dataframe with values from another, based on values in two other columns? (Python/Pandas)

Question

I need to add a column to a dataframe and fill it with values from another dataframe, but I do not have a unique ID or key or index that is shared between them. They do have two identifiers in common that make each row unique between them, and I want to try and match on both those columns.

Here is an example...

Student ID  Test Day        Student ID  Test Day    Grade
E0304003    Quiz 1      E0304003    Quiz 1  A
E3305002    Quiz 1      E0303003    Test 2  A
E0303002    Test 1      E0304005    Quiz 2  D
E0304004    Quiz 1      E3306001    Test 1  C
E0301001    Quiz 1      E0301001    Quiz 1  A
E0307002    Test 2      E0307002    Test 2  C
E0303003    Test 2      E0308002    Quiz 2  A
E0304005    Quiz 2      E0301002    Test 2  C
E3306001    Test 1      E3305002    Quiz 1  C
E0308002    Quiz 2      E0303002    Test 1  F
E0301002    Test 2      E0304004    Quiz 1  C

I want to create a third column in the first set that is also Grade, and contains the same values for rows that match the student ID and Test Day in the other dataframe.

You can see that the column order in each set doesn't match, and in the real data, there will be missing values, so I need this to be dynamic; it's not as simple as just sorting and then matching.

Can I do something like:

For row in df_one:
        if df_one["Student ID"] == df_two["Student ID"] and df_one["Test Day"] == df_two["Day"]:
                df_one["Grade"] = df_two["Grade"]
        else:
                df_one["Grade"] = "Unknown"

jezrael · Accepted Answer

Use merge with left join and if some values not match get NaNs, which are replaced by fillna:

df = pd.merge(df_one, df_two, on=['Student ID','Test Day'], how='left')
df['Grade'] = df['Grade'].fillna('Unknown')
print (df)

   Student ID Test Day Grade
0    E0304003   Quiz 1     A
1    E3305002   Quiz 1     C
2    E0303002   Test 1     F
3    E0304004   Quiz 1     C
4    E0301001   Quiz 1     A
5    E0307002   Test 2     C
6    E0303003   Test 2     A
7    E0304005   Quiz 2     D
8    E3306001   Test 1     C
9    E0308002   Quiz 2     A
10   E0301002   Test 2     C

Fill new column in one dataframe with values from another, based on values in two other columns? (Python/Pandas)

Answers (1)

Related Questions