Changing values in a column based on a match

Question

I have a pandas DataFrame that contains the names of Brazilian universities, but sometimes the names appear in short form and sometimes in long form (for example, the Universidade Federal do Rio de Janeiro is sometimes identified as UFRJ). The DataFrame looks like this:

| college                                |
|----------------------------------------|
| Universidade Federal do Rio de Janeiro |
| UFRJ                                   |
| Universidade de Sao Paulo              |
| USP                                    |
| Catholic University of Minas Gerais    |

I have another DataFrame that holds, in separate columns, the short name and the long name of SOME (not all) of those universities. It looks like this:

| long_name                              | short_name |
|----------------------------------------|------------|
| Universidade Federal do Rio de Janeiro | UFRJ       |
| Universidade de Sao Paulo              | USP        |

What I want is to replace every short name with its long name, so that the college column of the first DataFrame becomes:

| college                                |
|----------------------------------------|
| Universidade Federal do Rio de Janeiro |
| Universidade Federal do Rio de Janeiro |
| Universidade de Sao Paulo              |
| Universidade de Sao Paulo              |
| Catholic University of Minas Gerais    |

Note: the last row does not have a match, so it stays the same.

Is there a way to do that using pandas and numpy (or any other library)?

jezrael · Accepted Answer

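Use Series.map with a lookup Series built from the second DataFrame (short_name as the index, long_name as the values). Values with no match become missing (NaN), so chain Series.fillna to restore the original values: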

df1['college'] = (df1['college'].map(df2.set_index('short_name')['long_name'])
                                .fillna(df1['college']))

print(df1)
                                  college
0  Universidade Federal do Rio de Janeiro
1  Universidade Federal do Rio de Janeiro
2               Universidade de Sao Paulo
3               Universidade de Sao Paulo
4     Catholic University of Minas Gerais
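
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frames are rebuilt from the tables in the question, and the names lookup, mapped, result and replaced are only illustrative. It assumes each short_name appears only once in df2, since mapping through a Series relies on a unique index. The last line shows Series.replace with a dict as a possible alternative that leaves unmatched values untouched without needing fillna.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question
df1 = pd.DataFrame({"college": [
    "Universidade Federal do Rio de Janeiro",
    "UFRJ",
    "Universidade de Sao Paulo",
    "USP",
    "Catholic University of Minas Gerais",
]})
df2 = pd.DataFrame({
    "long_name": ["Universidade Federal do Rio de Janeiro",
                  "Universidade de Sao Paulo"],
    "short_name": ["UFRJ", "USP"],
})

# Lookup Series: index = short_name, values = long_name
# (assumes short_name values are unique)
lookup = df2.set_index("short_name")["long_name"]

# map() returns NaN wherever the value is not a known short name ...
mapped = df1["college"].map(lookup)

# ... so fall back to the original value in those rows
result = mapped.fillna(df1["college"])

# Alternative sketch: replace() with a dict only touches exact matches
replaced = df1["college"].replace(lookup.to_dict())

print(result.tolist())
```

Both result and replaced should hold the same values here; map plus fillna is the approach from the answer above, while replace is just a shorter variant for this exact-match case.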
