Reputation: 681
I have three lists X, Y, Z as follows:
X: [1, 1, 2, 3, 4, 5, 5, 5]
Y: [3, 3, 2, 6, 7, 1, 1, 2]
Z: [0, 0, 1, 1, 2, 3, 3, 4]
I am trying to remove every triplet (X[i], Y[i], Z[i]) that occurs more than once, dropping all of its copies rather than keeping one, to get the reduced lists below. All three lists always have the same length, both initially and at the end:
X: [2, 3, 4, 5]
Y: [2, 6, 7, 2]
Z: [1, 1, 2, 4]
I tried using zip(X, Y, Z), but I can't index the result, and dict.fromkeys removes only one of the duplicates and leaves the other in the new list. I want to remove both.
Any help is appreciated!
Upvotes: 4
Views: 2497
Reputation: 6526
Here is my solution without any import, but still short and easily readable:
X = [1, 1, 2, 3, 4, 5, 5, 5]
Y = [3, 3, 2, 6, 7, 1, 1, 2]
Z = [0, 0, 1, 1, 2, 3, 3, 4]
zipped = list(zip(X, Y, Z))
X, Y, Z = zip(*[i for i in zipped if zipped.count(i) == 1])
X, Y, Z = list(X), list(Y), list(Z)
print(X, Y, Z, sep='\n')
# [2, 3, 4, 5]
# [2, 6, 7, 2]
# [1, 1, 2, 4]
Upvotes: 0
Reputation: 164663
Using collections.Counter and zip, you can count how often each triplet occurs.
Then keep only the triplets that occur once, via a generator expression.
from collections import Counter
X = [1, 1, 2, 3, 4, 5, 5, 5]
Y = [3, 3, 2, 6, 7, 1, 1, 2]
Z = [0, 0, 1, 1, 2, 3, 3, 4]
c = Counter(zip(X, Y, Z))
X, Y, Z = zip(*(k for k, v in c.items() if v == 1))
print(X, Y, Z, sep='\n')
# (2, 3, 4, 5)
# (2, 6, 7, 2)
# (1, 1, 2, 4)
Note: if ordering is a requirement and you are not using Python 3.6+ (insertion order of dicts is a language guarantee only from 3.7), you can create an "OrderedCounter" instead by subclassing collections.Counter together with collections.OrderedDict.
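For reference, the "OrderedCounter" mentioned above could look roughly like the sketch below. The class name and the variable names are only illustrative; the pattern of inheriting from both Counter and OrderedDict is the usual way to get a counter that remembers first-insertion order on older interpreters.

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    """Counter that remembers the order in which keys were first seen."""


X = [1, 1, 2, 3, 4, 5, 5, 5]
Y = [3, 3, 2, 6, 7, 1, 1, 2]
Z = [0, 0, 1, 1, 2, 3, 3, 4]

counts = OrderedCounter(zip(X, Y, Z))
# Keep only the triplets seen exactly once, in their original order.
unique_rows = [row for row, n in counts.items() if n == 1]
X2, Y2, Z2 = (list(col) for col in zip(*unique_rows))
```

On Python 3.7+ a plain Counter should behave the same way, so the subclass is only needed for older versions.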
Upvotes: 4
Reputation: 4499
Not the best possible approach
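>>> from collections import Counter
>>> x = [1, 1, 2, 3, 4, 5, 5, 5]
>>> y = [3, 3, 2, 6, 7, 1, 1, 2]
>>> z = [0, 0, 1, 1, 2, 3, 3, 4]
>>> zipped_items = list(zip(x, y, z))
>>> counts = Counter(zipped_items)
>>> # keep only the triplets that occur exactly once, in their original order
>>> filtered_items = [item for item in zipped_items if counts[item] == 1]
>>> # split the surviving triplets back into three lists
>>> x1, y1, z1 = [[item[i] for item in filtered_items] for i in range(3)]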
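The same idea can be wrapped in a small function that works for any number of equal-length lists. This is only a sketch: the name drop_all_duplicates is made up for illustration, and the empty-result branch is an added assumption about what should happen when every triplet is duplicated (unpacking `zip(*[])` into three names would otherwise raise a ValueError).

```python
from collections import Counter


def drop_all_duplicates(*lists):
    """Return copies of the lists without any row that occurs more than once."""
    rows = list(zip(*lists))
    counts = Counter(rows)
    # Keep rows seen exactly once, preserving their original order.
    kept = [row for row in rows if counts[row] == 1]
    if not kept:
        # Nothing survived: return one empty list per input list.
        return [[] for _ in lists]
    return [list(col) for col in zip(*kept)]


X = [1, 1, 2, 3, 4, 5, 5, 5]
Y = [3, 3, 2, 6, 7, 1, 1, 2]
Z = [0, 0, 1, 1, 2, 3, 3, 4]
X2, Y2, Z2 = drop_all_duplicates(X, Y, Z)
```

Counting once with Counter keeps the work roughly linear in the number of rows, whereas calling list.count for every row rescans the whole list each time.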
Upvotes: 0
Reputation: 3713
It's convenient to use the pandas library for this task. Just create a DataFrame from the lists and apply drop_duplicates with keep=False (which removes all duplicated rows):
import pandas as pd
dct = {
"X": [1, 1, 2, 3, 4, 5, 5, 5],
"Y": [3, 3, 2, 6, 7, 1, 1, 2],
"Z": [0, 0, 1, 1, 2, 3, 3, 4],
}
d = pd.DataFrame(dct)
result = d.drop_duplicates(keep=False)  # returns a new DataFrame; d itself is unchanged
print(result)
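If plain lists are needed afterwards, as in the question, the columns of the reduced frame can be converted back. A minimal sketch, assuming pandas is installed; the names df, reduced, X2, Y2 and Z2 are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "X": [1, 1, 2, 3, 4, 5, 5, 5],
    "Y": [3, 3, 2, 6, 7, 1, 1, 2],
    "Z": [0, 0, 1, 1, 2, 3, 3, 4],
})

# keep=False drops every row that has a duplicate, not just the later copies.
reduced = df.drop_duplicates(keep=False)

# Convert each remaining column back to a plain Python list.
X2, Y2, Z2 = (reduced[col].tolist() for col in ("X", "Y", "Z"))
```

The surviving rows keep their original index labels; calling reset_index(drop=True) on the result renumbers them if that matters.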
Upvotes: 1