mb567

Reputation: 681

Remove both copies of duplicates across multiple lists in Python

I have three lists X, Y, Z as follows:

X: [1, 1, 2, 3, 4, 5, 5, 5]
Y: [3, 3, 2, 6, 7, 1, 1, 2]
Z: [0, 0, 1, 1, 2, 3, 3, 4]

I am trying to remove every triplet (X[i], Y[i], Z[i]) that occurs more than once, dropping all of its occurrences rather than just the repeats, to get reduced lists as follows. All three lists always have the same length, both initially and after the reduction:

X: [2, 3, 4, 5]
Y: [2, 6, 7, 2]
Z: [1, 1, 2, 4]

I tried using zip(X, Y, Z), but I can't index the result, and dict.fromkeys removes only one of the duplicates and leaves the other in the new list. I want to remove both.

Any help is appreciated!

Upvotes: 4

Views: 2497

Answers (4)

Laurent H.

Reputation: 6526

Here is my solution without any import, but still short and easily readable:

X = [1, 1, 2, 3, 4, 5, 5, 5]
Y = [3, 3, 2, 6, 7, 1, 1, 2]
Z = [0, 0, 1, 1, 2, 3, 3, 4]

zipped = list(zip(X, Y, Z))
# Keep only the triplets that occur exactly once, then unzip back into columns
X, Y, Z = zip(*[i for i in zipped if zipped.count(i) == 1])
X, Y, Z = list(X), list(Y), list(Z)

print(X, Y, Z, sep='\n')
# [2, 3, 4, 5]
# [2, 6, 7, 2]
# [1, 1, 2, 4]

Upvotes: 0

jpp

Reputation: 164663

Using collections.Counter and zip, you can count how often each triplet occurs.

Then keep only the triplets with a count of 1, via a generator expression.

from collections import Counter

X = [1, 1, 2, 3, 4, 5, 5, 5]
Y = [3, 3, 2, 6, 7, 1, 1, 2]
Z = [0, 0, 1, 1, 2, 3, 3, 4]

c = Counter(zip(X, Y, Z))

X, Y, Z = zip(*(k for k, v in c.items() if v == 1))

print(X, Y, Z, sep='\n')

(2, 3, 4, 5)
(2, 6, 7, 2)
(1, 1, 2, 4)

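Note: if ordering is a requirement and you are not on Python 3.7+ (where dicts, and therefore Counter, are guaranteed to preserve insertion order; CPython 3.6 does so as an implementation detail), you can create an "OrderedCounter" instead by subclassing collections.Counter together with collections.OrderedDict.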
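For illustration, here is a minimal sketch of such an "OrderedCounter", following the recipe shown in the collections documentation. The names X2, Y2 and Z2 are only used here to avoid overwriting the inputs; on recent Python versions a plain Counter should behave the same way.

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    """Counter that remembers the order in which keys were first seen."""


X = [1, 1, 2, 3, 4, 5, 5, 5]
Y = [3, 3, 2, 6, 7, 1, 1, 2]
Z = [0, 0, 1, 1, 2, 3, 3, 4]

c = OrderedCounter(zip(X, Y, Z))

# Keep the triplets seen exactly once, in first-seen order,
# then unzip them back into three lists.
X2, Y2, Z2 = (list(col) for col in zip(*(k for k, v in c.items() if v == 1)))

print(X2, Y2, Z2, sep='\n')
```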

Upvotes: 4

shanmuga

Reputation: 4499

Not the best possible approach, but it works:

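>>> from collections import Counter
>>> x = [1, 1, 2, 3, 4, 5, 5, 5]
>>> y = [3, 3, 2, 6, 7, 1, 1, 2]
>>> z = [0, 0, 1, 1, 2, 3, 3, 4]
>>> zipped_items = list(zip(x, y, z))
>>> counts = Counter(zipped_items)
>>> filtered_items = [item for item in zipped_items if counts[item] == 1]
>>> x1, y1, z1 = [[item[i] for item in filtered_items] for i in range(3)]
>>> x1, y1, z1
([2, 3, 4, 5], [2, 6, 7, 2], [1, 1, 2, 4])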
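The same idea can be wrapped in a small helper that takes any number of lists. This is only a sketch, and drop_repeated_rows is a hypothetical name, not something from a library. Building each output list by position means that an input where every tuple is repeated still returns one empty list per input, whereas unpacking the result of zip(*...) would raise a ValueError in that case.

```python
from collections import Counter


def drop_repeated_rows(*lists):
    """Return new lists with every repeated index-wise tuple removed.

    Hypothetical helper for illustration; assumes all lists have equal length.
    """
    rows = list(zip(*lists))
    counts = Counter(rows)
    # Keep rows that occur exactly once, preserving their original order.
    kept = [row for row in rows if counts[row] == 1]
    # One output list per input list, even when nothing is kept.
    return [[row[i] for row in kept] for i in range(len(lists))]


x1, y1, z1 = drop_repeated_rows(
    [1, 1, 2, 3, 4, 5, 5, 5],
    [3, 3, 2, 6, 7, 1, 1, 2],
    [0, 0, 1, 1, 2, 3, 3, 4],
)
print(x1, y1, z1, sep='\n')
```

Counting once with Counter keeps this roughly linear in the list length, unlike calling list.count for every element.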

Upvotes: 0

koPytok

Reputation: 3713

It's convenient to use the pandas library for this task. Create a DataFrame from the lists and call drop_duplicates with keep=False, which removes every row that has a duplicate instead of keeping one copy:

import pandas as pd

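dct = {
    "X": [1, 1, 2, 3, 4, 5, 5, 5],
    "Y": [3, 3, 2, 6, 7, 1, 1, 2],
    "Z": [0, 0, 1, 1, 2, 3, 3, 4],
}
d = pd.DataFrame(dct)
# drop_duplicates returns a new DataFrame, so keep the result
d = d.drop_duplicates(keep=False)
# Convert the remaining columns back to plain lists
X, Y, Z = d["X"].tolist(), d["Y"].tolist(), d["Z"].tolist()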

Upvotes: 1
