dark horse

Reputation: 447

Pandas - Performing left join across 2 Dataframe in Pandas

I have two dataframes: one lists every day in a month, and the other lists the days on which a staff member marked attendance. I am trying to perform a left join so that the new Dataframe contains all dates and shows on which of them the employee did and did not mark attendance.

df1 looks like this:

days
01-01-2018
02-01-2018
03-01-2018
04-01-2018
05-01-2018
06-01-2018
07-01-2018

df2 looks like this:

date, emp_id
01-01-2018,101
03-01-2018,101
04-01-2018,101
06-01-2018,101

I am trying to create a new Dataframe as below:

date,marked,emp_id
01-01-2018,01-01-2018,101
02-01-2018,02-01-2018,101
03-01-2018,03-01-2018,101
04-01-2018,04-01-2018,101
05-01-2018,05-01-2018,101
06-01-2018,06-01-2018,101

Where a date from df1 also exists in df2, the new Dataframe should hold that date; otherwise the value should be null. I tried the following, but I see it returns all dates:

new_df = pd.merge(df1, df2,  how='left', left_on=['days'], right_on = ['date'])

Upvotes: 0

Views: 88

Answers (1)

Dani Mesejo

Reputation: 61910

You could do something like this:

new_df = pd.merge(df1, df2,  how='outer', left_on=['days'], right_on = ['date'])
new_df = new_df.fillna({'emp_id': 101.0})
print(new_df)

Output

        days       date  emp_id
0 2018-01-01 2018-01-01   101.0
1 2018-01-02        NaT   101.0
2 2018-01-03 2018-01-03   101.0
3 2018-01-04 2018-01-04   101.0
4 2018-01-05        NaT   101.0
5 2018-01-06 2018-01-06   101.0
6 2018-01-07        NaT   101.0
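For anyone who wants to reproduce this, here is a minimal sketch that builds both frames from the sample data in the question. It assumes the dates are day-month-year strings that get parsed with pd.to_datetime, and that emp_id 101 is the only employee (the name left_df is just for illustration). It uses how='left', as in the question: since every date in df2 also appears in df1, the result is the same as the outer join above. A left join always keeps every row of df1, which is why all dates are returned; the unmatched days are the ones where date is NaT.

```python
import pandas as pd

# Sample data copied from the question; dates are day-month-year strings.
df1 = pd.DataFrame({"days": ["01-01-2018", "02-01-2018", "03-01-2018", "04-01-2018",
                             "05-01-2018", "06-01-2018", "07-01-2018"]})
df2 = pd.DataFrame({"date": ["01-01-2018", "03-01-2018", "04-01-2018", "06-01-2018"],
                    "emp_id": [101, 101, 101, 101]})

# Parse the strings so both key columns are datetime64 (assumed format).
df1["days"] = pd.to_datetime(df1["days"], format="%d-%m-%Y")
df2["date"] = pd.to_datetime(df2["date"], format="%d-%m-%Y")

# Left join: every row of df1 is kept; days without attendance get NaT in 'date'.
left_df = pd.merge(df1, df2, how="left", left_on="days", right_on="date")

# Assumption: a single employee, so the missing emp_id can be filled with 101.
left_df = left_df.fillna({"emp_id": 101})
print(left_df)
```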

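If you want an indicator column instead, do this:

import numpy as np
import pandas as pd

# Merge, then fill emp_id for the days that have no attendance record.
new_df = pd.merge(df1, df2, how='outer', left_on=['days'], right_on=['date']).fillna({'emp_id': 101.0})
# 1 where the day has a matching attendance date, 0 where 'date' is NaT.
new_df['marked'] = (new_df.days == new_df.date).astype(np.uint8)
new_df = new_df.drop('date', axis=1)
print(new_df)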

Output

        days  emp_id  marked
0 2018-01-01   101.0       1
1 2018-01-02   101.0       0
2 2018-01-03   101.0       1
3 2018-01-04   101.0       1
4 2018-01-05   101.0       0
5 2018-01-06   101.0       1
6 2018-01-07   101.0       0
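As an alternative sketch, pd.merge also accepts indicator=True, which adds a _merge column saying whether each row matched on both sides, so the flag does not depend on comparing the two date columns. The frame names (days, attendance, result) and the hard-coded 101 are assumptions for illustration; with more than one employee the fill step would need a different approach.

```python
import pandas as pd

# Sample data from the question, parsed as day-month-year (assumed format).
days = pd.DataFrame({"days": pd.to_datetime(
    ["01-01-2018", "02-01-2018", "03-01-2018", "04-01-2018",
     "05-01-2018", "06-01-2018", "07-01-2018"], format="%d-%m-%Y")})
attendance = pd.DataFrame({
    "date": pd.to_datetime(
        ["01-01-2018", "03-01-2018", "04-01-2018", "06-01-2018"], format="%d-%m-%Y"),
    "emp_id": [101, 101, 101, 101],
})

# indicator=True adds a '_merge' column with 'both' or 'left_only' for a left join.
merged = days.merge(attendance, how="left", left_on="days", right_on="date",
                    indicator=True)

# 1 if the day has an attendance record, 0 otherwise.
merged["marked"] = (merged["_merge"] == "both").astype("uint8")

# Assumption: single employee, so missing ids are filled with 101 and cast back to int.
merged["emp_id"] = merged["emp_id"].fillna(101).astype(int)

result = merged.drop(columns=["date", "_merge"])
print(result)
```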

Upvotes: 1
