Conditionally fill a column of a pandas df with values of a different df

Question

I have two DataFrames of different shapes. One contains words and their frequencies, the other contains words and their lemmas.

The first df always maps one word to one frequency; the second df maps many words to one lemma, so the same lemma appears on several rows. E.g.:

df1:

  word  frequency
    de   33504559
   que   32700217
    no   28263302
     a   21978600
    la   21249418

and df2:

     lemma       word
   zurullo   zurullos
  zurupeto  zurupetos
    zutano     zutana
    zutano    zutanas
    zutano    zutanos

I would like to add the lemma information to df1, by searching each word of df1, comparing it to the words in df2, and pulling the lemma information from df2 to add it back to df1.

There are useful answers for when the value is always the same in df1, but since I want to do this for each row that each contains a different word, I am not sure how to proceed. (I checked the merging and concatenating docs section but resurfaced more confused than before...)

In plain Python I would use a loop, e.g.:

new_df = {}
# assuming df1 is a dict of word -> frequency and df2 a dict of word -> lemma
for w, f in df1.items():
    if w in df2:
        new_df[w] = (df2[w], f)

Would be happy to learn more about this using pandas dataframe operations.

Zeugma · Accepted Answer

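Use a left merge on the word column. It keeps every row of df1 and adds the matching lemma from df2 (NaN where df2 has no entry for the word). merge returns a new DataFrame, so assign the result:

df1 = df1.merge(df2, how='left', on='word')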
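
To make the behaviour concrete, here is a minimal sketch on small made-up frames. The words and frequencies below are illustrative only, not the asker's real data. It also shows a `Series.map` lookup as one possible alternative, under the assumption that each word in df2 has at most one lemma.

```python
import pandas as pd

# Hypothetical sample data, shaped like the frames in the question
df1 = pd.DataFrame({
    "word": ["zutanos", "de", "zurullos"],
    "frequency": [120, 33504559, 45],
})
df2 = pd.DataFrame({
    "lemma": ["zurullo", "zutano", "zutano"],
    "word": ["zurullos", "zutana", "zutanos"],
})

# Left join: every row of df1 is kept, lemma is NaN where df2 has no match
merged = df1.merge(df2, how="left", on="word")
print(merged)

# Alternative (assumes one lemma per word): build a word -> lemma lookup
# and map it onto df1's word column
lookup = df2.drop_duplicates("word").set_index("word")["lemma"]
mapped = df1.assign(lemma=df1["word"].map(lookup))
print(mapped)
```

Note that if a word occurs on several rows of df2 with different lemmas, the merge produces one output row per match, so the result can have more rows than df1; the `map` variant above keeps only the first lemma for each word.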
