Reputation: 51
I have two dataframe named emp_crm and verient. If below conditions are matched append emp_crm dataframe values to verient dataframe.
Problem statement:
I want to fetch records from emp_crm
and append to verient
if emp_crm[ACD]
matches Verient[PbxId]
AND Date of AudioStart
equals to Date of OPEN_DT
AND Time of OPEN_DT
is less than Time of AudioStart
AND greater than Time of AudioEnd
[condition: AudioStartTime < Ticket OpenTime < Audio EndTime]
then for matching emp_crm[ACD]
with Verient[PbxId]
append the emp_crm's [all columns]
to Verient[all columns].
Pointed to be noted:
No records should get deleted from verient and append only for matching records and leave other records blank.
My Approach -
emp_crm[ACD] == Verient[PbxId]
AND verient[AudioStartDate] == emp_crm[TicketOpenDate]
AND StartTime_in_Seconds < OpenTime_in_Seconds < EndTime_in_Seconds
then emp_crm[ACD] == Verient[PbxId]
, DataFrame: emp_crm
ACD NID OPEN_DT OPEN_ID TICKET_NO TIER_1 OpenDate OpenTime OpenTime_in_Seconds
6557 n199 2021-05-19 20:13:00 n199 INC000047814 Voice 2021-05-19 20:13:00 72780
7768 n141 2021-04-30 19:39:00 n141 INC000047860 Data 2021-04-30 19:39:00 70740
7768 n141 2021-04-30 15:56:30 n141 INC000047853 Data 2021-04-30 15:56:30 57390
6570 n116 2021-05-19 16:00:27 n116 INC000047848 Ring 2021-05-24 16:10:00 58200
6570 n116 2021-05-19 15:55:02 n116 INC000047784 Voice 2021-05-19 15:55:02 57302
DataFrame: verient
AudioEnd AudioStart PbxId AudioStartDate StartTime AudioEndDate EndTime StartTime_in_Seconds EndTime_in_Seconds
2021-04-30 16:07:24 2021-04-30 15:35:27 6576 2021-04-30 15:35:27 2021-04-30 16:07:24 56127 58044
2021-04-30 15:51:12 2021-04-30 15:46:30 7768 2021-04-30 15:46:30 2021-04-30 15:51:12 56790 57072
2021-04-30 16:16:06 2021-04-30 15:48:22 7768 2021-04-30 15:48:22 2021-04-30 16:16:06 56902 58566
2021-05-19 16:00:47 2021-05-19 15:50:02 6570 2021-05-19 15:50:02 2021-05-19 16:00:47 57002 57647
2021-05-19 16:02:20 2021-05-19 15:53:26 6570 2021-05-19 15:53:26 2021-05-19 16:02:20 57206 57740
Final DataFrame: verient_crm
AudioEnd AudioStart PbxId ACD NID OPEN_DT OPEN_ID TICKET_NO TIER_1
2021-04-30 16:07:24 2021-04-30 15:35:27 6576
2021-04-30 15:51:12 2021-04-30 15:46:30 7768
2021-04-30 16:16:06 2021-04-30 15:48:22 7768 7768 n141 2021-04-30 15:56:30 n141 INC000047853 Data
2021-05-19 16:00:47 2021-05-19 15:50:02 6570 6570 n116 2021-05-19 16:00:27 n116 INC000047848 Ring
2021-05-19 16:02:20 2021-05-19 15:53:26 6570 6570 n116 2021-05-19 15:55:02 n116 INC000047784 Voice
Final Result after dropping unnecessary columns:
------------------------------------------------
verient_crm(AudioEnd,AudioStart,PbxId,ACD, NID,OPEN_DT, OPEN_ID, TICKET_NO, TIER_1)
import pandas as pd
import numpy as np
matchedDict = {}
for record in emp_crm.iterrows():
for vr in verient.iterrows():
if(record == vr): ## verient.pbxLoginID == emp_CRM.ACD
if(vr.AudioStart_in_TimeStamp <= record.OPEN_DT_in_TimeStamp and record.OPEN_DT_in_TimeStamp <= vr.AudioEnd_in_TimeStamp):
matchedDict[vr.index] = record
# Append emp_crm(ACD, NID,OPEN_DT, OPEN_ID, TICKET_NO, TIER_1) with Verient for every matched ACD == PbxID from emp_crm to Verient.
for item in matchedDict:
record = item.value
verient.iloc[item.key,col_index_for_ACD] = record['ACD'] ## Don't know how to fetch col_index_for_tier1
verient.iloc[item.key,col_index_for_NID] = record['NID']
verient.iloc[item.key,col_index_for_OPEN_DT] = record['OPEN_DT']
verient.iloc[item.key,col_index_for_OPEN_ID] = record['OPEN_ID']
verient.iloc[item.key,col_index_for_TICKET_NO] = record['TICKET_NO']
verient.iloc[item.key,col_index_for_TIER_1] = record['TIER_1']
I am getting the below error and not able to solve this.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-74-bf0a6bb075ee> in <module>
4 for record in emp_crm.iterrows():
5 for vr in verient.iterrows():
----> 6 if(record['ACD'] == vr['pbxLoginID']): ## verient.pbxLoginID == emp_CRM.ACD or not
7 if(vr.AudioStart_in_TimeStamp <= record.OPEN_DT_in_TimeStamp and record.OPEN_DT_in_TimeStamp <= vr.AudioEnd_in_TimeStamp):
8 matchedDict[vr.index] = record
TypeError: tuple indices must be integers or slices, not str
Pointed to be noted:
Upvotes: 0
Views: 94
Reputation: 2452
Instead of iterating over the dataframes (which is a very costly operation), you can merge the dataframes and then filter the resultant dataframe as follows-
# Left merge
verient = pd.merge(verient, emp_crm, left_on = ['PbxId'], right_on = ['ACD'], how='left')
# Filter
verient= verient.loc[(verient.AudioStart_in_TimeStamp < verient.OPEN_DT_in_TimeStamp) & (verient.OPEN_DT_in_TimeStamp < verient.AudioEnd_in_TimeStamp)]
You can then drop unnecessary columns if needed.
Edit - To keep original rows from verient
dataframe, you can assign NULL
values to emp_crm
columns wherever the conditions don't match. Instead of filter
, do this-
verient.loc[~((verient.AudioStart_in_TimeStamp < verient.OPEN_DT_in_TimeStamp) & (verient.OPEN_DT_in_TimeStamp < verient.AudioEnd_in_TimeStamp)), ['ACD', 'NID', 'OPEN_DT', 'OPEN_ID', 'TICKET_NO', 'TIER_1']] = np.nan
Upvotes: 1