CutePoison
CutePoison

Reputation: 5355

Pandas join on column based on value in another column

I have read several other SO questions with (almost) the same problem, altho this is a bit different.

I have a dataframe

#df_animals

animal | weight | id
-------+--------+----
"cow"     50       1
"cow"     45       2
"pig"     30       1
"pig"     25       2

and another dataframe

"df_cow_price

id | price
---+------
1    100
2    70

and I want to join df_cow_price onto df_animals joining on id but only where animal=="cow".

Right now my work-around is by filtering on animal=="cow", remove that filtered data from the dataframe, join on the filtered dataframe and concat the filtered and unfiltered dataframe.

But isn't there a way to do that in "one go" i.e something like this pseudo-code

df_total = pd.merge(df_animals,df_cow_price, left_on="id", right_on="id", filter_left = df_animals["animal"]=="cow")

animal | weight | id | price
-------+--------+----+------
"cow"     50       1   100
"cow"     45       2   70
"pig"     30       1   None
"pig"     25       2   None

Upvotes: 0

Views: 1773

Answers (1)

jezrael
jezrael

Reputation: 862541

This is not possible like need in merge, need postprocessing or postprocessing like:

df_total = pd.merge(df_animals, df_cow_price, on="id", how='left')
df_total.loc[df_total["animal"]!="cow", 'price'] = np.nan

Or:

df_total1=pd.merge(df_animals[df_animals["animal"]=="cow"],df_cow_price, on="id",how='left')

df_total2=df_animals[df_animals["animal"]!="cow"]

df_total = pd.concat([df_total1, df_total2], ignore_index=True)

Upvotes: 1

Related Questions