Check for duplicate values in some columns of a dataframe

Question

I have a dataframe that is organized similarly to this:

I am trying to create a new column that indicates whether the GUID matches and the 'DL', 'DRI', and 'DS' columns match as well. For the data in the image above, the new column would flag the rows as a match.

I've tried:

cols = {['DL','DRI','DS']}
df['match'] = df[cols].eq(df.col1.shift())

But I am getting: TypeError: unhashable type: 'list'
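The error most likely comes from the curly braces rather than from pandas: {['DL','DRI','DS']} is a set literal whose only element is a list, and lists cannot be hashed. A minimal sketch in plain Python that reproduces this (the variable error_message is introduced here only for illustration):

```python
# Elements of a set must be hashable; a list is not, so building
# the set fails before pandas is ever involved.
try:
    cols = {['DL', 'DRI', 'DS']}
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The message mentions "unhashable type: 'list'"
# (exact wording can vary slightly between Python versions).
print(error_message)

# A plain list of column names is what DataFrame indexing expects.
cols = ['DL', 'DRI', 'DS']
```

Dropping the braces removes the TypeError, although the comparison against a single shifted column may still not express the intended check.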

luigigi · Accepted Answer

This may be what you want, though you should consider rephrasing your question, because it isn't clear what you are actually asking for.

import pandas as pd

df = pd.DataFrame({'GUID':['00059','00059','123'], 'DL':['','','123'], 'DRI':[True,True,True], 'DS':['','','123'], 'Model':['asd','qwe','123']})
df

    GUID   DL   DRI   DS Model
0  00059       True        asd
1  00059       True        qwe
2    123  123  True  123   123

# keep=False marks every row of a duplicate group, not only the later occurrences
df['match'] = df.duplicated(['GUID', 'DL', 'DRI', 'DS'], keep=False)
df

    GUID   DL   DRI   DS Model  match
0  00059       True        asd   True
1  00059       True        qwe   True
2    123  123  True  123   123  False
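To see why keep=False matters, here is a small sketch that rebuilds the same sample frame and compares it with the default keep='first'. The names key_cols, first_only and all_marked are made up for this illustration:

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    'GUID': ['00059', '00059', '123'],
    'DL': ['', '', '123'],
    'DRI': [True, True, True],
    'DS': ['', '', '123'],
    'Model': ['asd', 'qwe', '123'],
})

key_cols = ['GUID', 'DL', 'DRI', 'DS']

# Default keep='first': the first row of each duplicate group is NOT flagged.
first_only = df.duplicated(key_cols)

# keep=False: every row that has a duplicate on key_cols is flagged.
all_marked = df.duplicated(key_cols, keep=False)

print(first_only.tolist())  # [False, True, False]
print(all_marked.tolist())  # [True, True, False]
```

With the default, row 0 would be reported as not matching even though row 1 has the same values, so keep=False is the setting that fits a "does this row match any other row" column.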
