yuchen huang

Reputation: 257

How to classify one column's values using another DataFrame?

I am trying to classify data against a reference DataFrame. The reference is df1, and I want to fill in the SubClass column of df2 by looking up each PAUCode in df1.

df1:
PAUCode     SubClass
1           RA
2           RB
3           CZ

df2:
PAUCode     SubClass
2           non      
2           non
2           non
3           non
1           non
2           non
3           non

I want df2 to end up like this:

expected result:

PAUCode     SubClass
2           RB      
2           RB
2           RB
3           CZ
1           RA
2           RB
3           CZ

Upvotes: 1

Views: 256

Answers (2)

BENY

Reputation: 323306

You can use reindex:

df1.set_index('PAUCode').reindex(df2.PAUCode).reset_index()
Out[9]: 
   PAUCode SubClass
0        2       RB
1        2       RB
2        2       RB
3        3       CZ
4        1       RA
5        2       RB
6        3       CZ
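
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two frames from the question and applies the same reindex call. The names `lookup` and `result` are just illustrative. It assumes every PAUCode appears only once in df1, since reindexing on an index with duplicate labels raises an error.

```python
import pandas as pd

# Frames rebuilt from the question's sample data
df1 = pd.DataFrame({"PAUCode": [1, 2, 3], "SubClass": ["RA", "RB", "CZ"]})
df2 = pd.DataFrame({"PAUCode": [2, 2, 2, 3, 1, 2, 3], "SubClass": ["non"] * 7})

# Index the reference frame by code, then pull one row per code in df2's order
lookup = df1.set_index("PAUCode")
result = lookup.reindex(df2["PAUCode"]).reset_index()

print(result)
```

Codes in df2 that do not exist in df1 would come back with NaN in SubClass rather than raising.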

Upvotes: 2

cs95

Reputation: 402603

Option 1
fillna

import numpy as np

df2 = df2.replace('non', np.nan)

df2.set_index('PAUCode').SubClass\
       .fillna(df1.set_index('PAUCode').SubClass)

PAUCode
2    RB
2    RB
2    RB
3    CZ
1    RA
2    RB
3    CZ
Name: SubClass, dtype: object

Option 2
map

df2.PAUCode.map(df1.set_index('PAUCode').SubClass)

0    RB
1    RB
2    RB
3    CZ
4    RA
5    RB
6    CZ
Name: PAUCode, dtype: object
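
Note that the mapped Series keeps the name PAUCode, so you will usually want to assign it back to the SubClass column. A rough sketch of that, using the question's data; `mapping` is an illustrative name, and the code 9 in the second part is a made-up value that is not in df1, included only to show what happens to unknown codes.

```python
import pandas as pd

# Frames rebuilt from the question's sample data
df1 = pd.DataFrame({"PAUCode": [1, 2, 3], "SubClass": ["RA", "RB", "CZ"]})
df2 = pd.DataFrame({"PAUCode": [2, 2, 2, 3, 1, 2, 3], "SubClass": ["non"] * 7})

# Series indexed by code, used as a lookup table
mapping = df1.set_index("PAUCode")["SubClass"]

# Overwrite the placeholder column in place; df2's row order and index are kept
df2["SubClass"] = df2["PAUCode"].map(mapping)

# Codes missing from df1 map to NaN; here they are put back to the placeholder.
# The code 9 is hypothetical and only serves to illustrate the fallback.
unknown = pd.Series([2, 9]).map(mapping).fillna("non")

print(df2)
print(unknown)
```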

Option 3
merge

df2[['PAUCode']].merge(df1, on='PAUCode')

   PAUCode SubClass
0        2       RB
1        2       RB
2        2       RB
3        2       RB
4        3       CZ
5        3       CZ
6        1       RA

Note that the row order here differs from df2's (the rows come out grouped by PAUCode), but each PAUCode still gets the same SubClass.
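
If the original row order matters, one variation is a left merge, which keeps the rows in df2's order. This is a sketch rather than part of the original answer: `how="left"` and `validate="many_to_one"` are additions, the latter being an optional guard that raises if a PAUCode is duplicated in df1.

```python
import pandas as pd

# Frames rebuilt from the question's sample data
df1 = pd.DataFrame({"PAUCode": [1, 2, 3], "SubClass": ["RA", "RB", "CZ"]})
df2 = pd.DataFrame({"PAUCode": [2, 2, 2, 3, 1, 2, 3], "SubClass": ["non"] * 7})

# Keep only the key column from df2 so its placeholder SubClass does not clash
merged = df2[["PAUCode"]].merge(
    df1,
    on="PAUCode",
    how="left",              # keep every df2 row, in df2's order
    validate="many_to_one",  # fail loudly if df1 has duplicate codes
)

print(merged)
```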

Upvotes: 4
