Reputation: 91
I am importing via pyodbc 2 dataframes: df1 and df2.
They are big and I need to compare them.
PS: They don't have the same size.
3 columns size Dataframes:
What I want to do:
If CODUSU columns are equal then df2[Situação] = 'K'
What I have done, but it's too slow:
for i in range(0,len(df2)):
for k in range(0,len(df1)):
if df2.loc[i][0] == df1.loc[k][0]:
df2[i]["Situação"] = "K"
I also need to:
If in item in df1[CODUSU] and not in df2[CODUSU] then df2[Situação] = 'Q'
If in item in df2[CODUSU] and not in df1[CODUSU] then df2[Situação] = 'B'
Upvotes: 0
Views: 155
Reputation: 149175
You should do an outer merge on the dataframes asking for the indicator variable:
resul = df2.merge(df1, how='outer', on='A', suffixes=('', '_y'),indicator=True)
Do not forget the DTDOSE
column for rows coming from df1
only:
resul.loc[resul['indicator'] == 'right_only', 'DTDOSE'] = resul.loc[
resul['indicator'] == 'right_only', 'DTDOSE_y']
Time to compute the new values for the Situação
column:
resul.loc[resul['indicator'] == 'both', 'Situação'] = 'K'
resul.loc[resul['indicator'] == 'right_only', 'Situação'] = 'Q'
resul.loc[resul['indicator'] == 'left_only', 'Situação'] = 'B'
And finally get rid of the auxiliary columns:
resul = resul[['CODUSU', 'DTDOSE', 'Situação']
Upvotes: 1