Remove duplicates and similar values from a list in Python?

Question

This question is a follow-up to "How do you remove duplicates from a list in Python whilst preserving order?".

I need to remove duplicates and/or similar values from a list.

I start from that question's answer and apply:

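def f7(seq):
    seen = set()
    seen_add = seen.add  # bind the method once to avoid repeated attribute lookups
    # seen_add(x) returns None, so "not seen_add(x)" is always True and just records x
    return [x for x in seq if x not in seen and not seen_add(x)]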

But when I apply it to my data, I get the following, which is clearly wrong: the highlighted values are equal, and one of each pair should be removed.

 [(Decimal('1.20149'), Decimal('1.25900')),
 *(Decimal('1.13583'), Decimal('1.07862'))*,
**(Decimal('1.07016'), Decimal('1.17773'))**,
 *(Decimal('1.13582'), Decimal('1.07863'))*,
  (Decimal('1.07375'), Decimal('0.92410')),
  (Decimal('1.01167'), Decimal('1.00900')),
**(Decimal('1.07015'), Decimal('1.17773'))**,
  (Decimal('0.95318'), Decimal('1.10171')),
  (Decimal('1.01507'), Decimal('0.79170')),
  (Decimal('0.95638'), Decimal('0.86445')),
  (Decimal('0.90109'), Decimal('0.94387')),
  (Decimal('0.84900'), Decimal('1.03060'))]

How would you remove those values that are identical?

shx2 · Accepted Answer

From the output, it looks like the seq you're passing contains 2-tuples. While the values inside the tuples may be the same, the tuples themselves (which are the elements of your sequence) are not, and therefore are not removed.
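
To make that concrete, here is a minimal sketch using one of the highlighted pairs from the question. The test list `[a, b, a]` is made up for illustration; `f7` is copied from the question.

```python
from decimal import Decimal

def f7(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if x not in seen and not seen_add(x)]

# Two of the highlighted tuples from the question
a = (Decimal('1.13583'), Decimal('1.07862'))
b = (Decimal('1.13582'), Decimal('1.07863'))

print(a == b)         # the tuples differ in their last digits, so they compare unequal
print(f7([a, b, a]))  # only the exact repeat of a is dropped; b is kept
```

Because `f7` relies on set membership, it only removes elements that compare exactly equal; tuples that differ in any digit are treated as distinct.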

If your intention is to get a flat list of the unique numbers, you can flatten it first:

import itertools

seq = [(1, 2), (2, 3), (1, 4)]
f7(itertools.chain(*seq))
=> [1, 2, 3, 4]
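
If the intention is instead to drop tuples that are nearly equal to one already kept, one possible approach is a tolerance-based comparison. This is only a sketch: the helper name `dedupe_close` and the tolerance `Decimal('0.0001')` are assumptions, not something stated in the question, and the tolerance would need to be chosen to suit the data.

```python
from decimal import Decimal

def dedupe_close(seq, tol=Decimal('0.0001')):
    """Keep the first of any tuples whose components all differ by at most tol.

    Hypothetical helper; tol is an assumed threshold, not taken from the question.
    """
    kept = []
    for item in seq:
        # item is "close" to an earlier tuple if every component is within tol
        is_close = any(
            all(abs(x - y) <= tol for x, y in zip(item, other))
            for other in kept
        )
        if not is_close:
            kept.append(item)
    return kept

# The four highlighted tuples from the question, in their original order
data = [
    (Decimal('1.13583'), Decimal('1.07862')),
    (Decimal('1.07016'), Decimal('1.17773')),
    (Decimal('1.13582'), Decimal('1.07863')),
    (Decimal('1.07015'), Decimal('1.17773')),
]

result = dedupe_close(data)
print(result)  # the first two tuples remain; the later near-duplicates are dropped
```

This preserves order like `f7`, but it compares each element against every kept element, so it is quadratic in the worst case and better suited to short lists.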
