BoB
BoB

Reputation: 159

How do I get a value from one dataframe meeting certain conditions from another dataframe?

I have two dataframes:

df1 with columns: id | date

and

df2 with columns: id| begin_date | end_date | var

I need to add var column to df1 for conditions:

Expectet result: df1 with columns: id | date | var.

Thanks for advices.

Upvotes: 0

Views: 65

Answers (2)

Danny
Danny

Reputation: 472

You can this do in a single step but your data types need to be consistent before you start comparing.

df1['var'] = df1.apply(lambda x: df2.loc[max(df2[(x['id'] == df2['id']) & (df2['begin_date'] == x['date']) & (x['date'] < df2['end_date'])].index), 'var'], axis = 1)

Upvotes: 1

Rajith Thennakoon
Rajith Thennakoon

Reputation: 4130

as far as i understand your requirement,i think you can do it by two steps.

1.first - merge the two dataframes by id

merged_data_frame = pd.merge(df1,df2,how="inner",on=["id"],indicator=True)

2.second step

  • convert date and begin_date,end_date types to datetime
  • filter data by your date condition

    merged_data_frame["date"] = pd.to_datetime(merged_data_frame["date"])
    merged_data_frame["begin_date"] = pd.to_datetime(merged_data_frame["begin_date"])
    merged_data_frame["end_date"] = pd.to_datetime(merged_data_frame["end_date"])
    merged_data_frame = merged_data_frame[merged_data_frame.begin_date <= merged_data_frame.date]
    merged_data_frame = merged_data_frame[merged_data_frame.date >  merged_data_frame.end_date]
    out=merged_data_frame.drop_duplicates(subset=["id"], keep='last')
    

Upvotes: 1

Related Questions