cd-6

Reputation: 190

Check for duplicate values in a subset of columns in a dataframe

I have a dataframe that is organized similarly to this:

[Image of the example dataframe]

I am trying to create a new column that indicates whether the GUID matches and the 'DL', 'DRI', and 'DS' columns match as well. For the rows in the image above, the new column should flag the data as a match.

I've tried:

cols = {['DL','DRI','DS']}
df['match'] = df[cols].eq(df.col1.shift())

But I am getting: TypeError: unhashable type: 'list'

Upvotes: 1

Views: 46

Answers (2)

luigigi

Reputation: 4215

Maybe this is what you want. Consider rephrasing your question, because it isn't clear what you are actually after.

import pandas as pd

df = pd.DataFrame({'GUID':['00059','00059','123'], 'DL':['','','123'], 'DRI':[True,True,True], 'DS':['','','123'], 'Model':['asd','qwe','123']})
df

    GUID   DL   DRI   DS Model
0  00059       True        asd
1  00059       True        qwe
2    123  123  True  123   123

df['match'] = df.duplicated(['GUID', 'DL', 'DRI', 'DS'], keep=False)
df

    GUID   DL   DRI   DS Model  match
0  00059       True        asd   True
1  00059       True        qwe   True
2    123  123  True  123   123  False
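
As a small sketch of what the keep argument changes, the snippet below rebuilds the same sample frame and compares keep=False with the default keep='first'. The variable names key_cols, all_dupes and later_dupes are made up for illustration.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({
    'GUID': ['00059', '00059', '123'],
    'DL': ['', '', '123'],
    'DRI': [True, True, True],
    'DS': ['', '', '123'],
    'Model': ['asd', 'qwe', '123'],
})

# Columns that must all agree for two rows to count as a match
key_cols = ['GUID', 'DL', 'DRI', 'DS']

# keep=False flags every row that belongs to a duplicated group
all_dupes = df.duplicated(key_cols, keep=False)

# The default keep='first' leaves the first occurrence unflagged
later_dupes = df.duplicated(key_cols)

print(all_dupes.tolist())
print(later_dupes.tolist())
```

If every matching row should be marked, keep=False is probably the variant to use; the default only marks the second and later occurrences.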

Upvotes: 1

oppressionslayer

Reputation: 7224

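cols = {['DL','DRI','DS']}

is in the wrong format. The curly braces build a set, and a list cannot be an element of a set because it is unhashable, which is what raises the TypeError. Use a plain list of column names instead:

cols = ['DL','DRI','DS']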
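
A minimal sketch of where the TypeError comes from, and of selecting the columns with a flat list of names. The sample frame and the names error_message and subset are assumptions for illustration, not taken from the question's data.

```python
import pandas as pd

# Hypothetical sample frame with the column names from the question
df = pd.DataFrame({
    'GUID': ['00059', '00059', '123'],
    'DL': ['', '', '123'],
    'DRI': [True, True, True],
    'DS': ['', '', '123'],
})

# Curly braces create a set; a list is unhashable, so it cannot be
# a set element and Python raises TypeError before pandas is involved
try:
    cols = {['DL', 'DRI', 'DS']}
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# A flat list of column names selects those columns as a DataFrame
cols = ['DL', 'DRI', 'DS']
subset = df[cols]
print(subset)
```

This only addresses the column selection; flagging the matching rows still needs something like duplicated on the chosen columns.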

Upvotes: 1
