user866364
user866364

Reputation:

Merge two dataframe and add new column based on merge

I have two dataframe that are sources based on my database and an API response:

database_data=[
    {
        "id": 123,
        "user_id": 555,
        "_deleted": True
    },
    {
        "id": 456,
        "user_id": 555,
        "_deleted": True
    }]


api_data = [
    {
        "id": 123,
        "user_id": 555,
        "_deleted": True
    },
    {
        "id": 789,
        "user_id": 555,
        "_deleted": True
    }
]

Based on those data, I would like perform a merge between API and database and apply a rule:

I've started to write some code to perform merge but I don't know how to add this conditional column.

df_db = pd.DataFrame.from_dict(database_data)
df_api = pd.DataFrame.from_dict(api_data)
# df_api.merge(df_db ... ??)

Any advice to solve it?

Upvotes: 2

Views: 775

Answers (2)

cs95
cs95

Reputation: 402263

Use merge with indicator=True, you can then compare the indicator output to get your column.

res = df_api.merge(df_db, how='left', indicator='indicator')
res['_deleted'] = res.pop('indicator') != "both"

   _deleted   id  user_id
0     False  123      555
1      True  789      555

Upvotes: 2

BENY
BENY

Reputation: 323226

I will using merge with indicator=True

df_db=df_api.drop('_deleted',1).merge(df_db.drop('_deleted',1) , on = ['id','user_id'],indicator=True,how='right')
df_db['_deleted']=df_db['_merge'].map({'both':False,'right_only':True})

df_db
Out[135]: 
    id  user_id      _merge  _deleted
0  123      555        both     False
1  456      555  right_only      True

Upvotes: 2

Related Questions