d'chang

Reputation: 191

Iterator Object for Removing Duplicates in Python

Hi, I'm trying to figure out how to create an iterator object in Python that removes, or rather skips, duplicates.

For example, if I have the values (1, 2, 3, 3, 4, 4, 5), I want to get (1, 2, 3, 4, 5).

I understand that to get an iterator object I have to write a class for it. So far I have:

class Unique:
    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.i < self.n:
            ...  # this is where I get stuck

I'm actually not entirely sure what to do next in this problem. Thanks in advance for any comments or help!

Upvotes: 3

Views: 2830

Answers (3)

Toothpick Anemone

Reputation: 4644

This removes all duplicates, although it does not preserve the original order and builds a new container rather than an iterator:

new_stuff = type(old_stuff)(set(old_stuff))
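
If the order matters, one variation on the same one-liner is to go through dict.fromkeys instead of set, since dictionaries keep insertion order from Python 3.7 on. A minimal sketch, where dedupe_keep_order is a hypothetical helper name and the elements are assumed to be hashable:

```python
def dedupe_keep_order(old_stuff):
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so the first occurrence of each element stays in place.
    # Assumes the container type accepts an iterable (list, tuple, ...).
    return type(old_stuff)(dict.fromkeys(old_stuff))


print(dedupe_keep_order((1, 3, 2, 5, 4, 3)))  # (1, 3, 2, 5, 4)
print(dedupe_keep_order([3, 3, 1, 1, 2]))     # [3, 1, 2]
```

Like the set version, this builds the whole result at once, so it is not an iterator in the sense the question asks for.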

Upvotes: -1

thefourtheye

Reputation: 239443

It is better to write a generator function, like this:

>>> def unique_values(iterable):
...     seen = set()
...     for item in iterable:
...         if item not in seen:
...             seen.add(item)
...             yield item
... 

You can then create a tuple of the unique values, like this:

>>> tuple(unique_values((1, 2, 3, 3, 4, 4, 5)))
(1, 2, 3, 4, 5)
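
Because the generator is lazy, it should also work on inputs that are too large to hold in memory, or that never end, as long as you stop consuming at some point. A small sketch using itertools.cycle and itertools.islice (the function is repeated here so that the snippet runs on its own):

```python
from itertools import cycle, islice


def unique_values(iterable):
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item


# cycle() repeats 1, 2, 3 forever; islice() stops after three unique values.
endless = cycle([1, 2, 3])
first_three = list(islice(unique_values(endless), 3))
print(first_three)  # [1, 2, 3]
```

Note that asking for a fourth value here would never return, because the endless input contains only three distinct items. The seen set also grows with the number of distinct values, so memory use is bounded by that count rather than by the input length.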

If you know for sure that the data will always be sorted, you can avoid creating the set and keep track of only the previous item, like this:

>>> def unique_values(iterable):
...     it = iter(iterable)
...     try:
...         previous = next(it)
...     except StopIteration:
...         return  # empty input: nothing to yield
...     yield previous
...     for item in it:
...         if item != previous:
...             previous = item
...             yield item
... 
>>> tuple(unique_values((1, 2, 3, 3, 4, 4, 5)))
(1, 2, 3, 4, 5)
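
For sorted (or otherwise grouped) input, the standard library's itertools.groupby can do the same job: it bundles runs of equal consecutive items, so taking one key per run drops the adjacent duplicates. A sketch, with unique_sorted as a hypothetical name:

```python
from itertools import groupby


def unique_sorted(iterable):
    # groupby() yields (key, group) pairs, one per run of equal
    # consecutive items; keeping only the key drops the repeats.
    for key, _group in groupby(iterable):
        yield key


print(tuple(unique_sorted((1, 2, 3, 3, 4, 4, 5))))  # (1, 2, 3, 4, 5)
print(tuple(unique_sorted(())))                     # ()
print(tuple(unique_sorted((1, 2, 1))))              # (1, 2, 1)
```

The last call shows the limitation of any previous-item approach: only adjacent repeats are removed, so unsorted input still needs the set-based version.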

You can also write an iterator object with a class, like this:

>>> class Unique:
...     def __init__(self, iterable):
...         self.__it = iter(iterable)
...         self.__seen = set()
... 
...     def __iter__(self):
...         return self
... 
...     def __next__(self):
...         while True:
...             next_item = next(self.__it)
...             if next_item not in self.__seen:
...                 self.__seen.add(next_item)
...                 return next_item
... 
>>> for item in Unique((1, 2, 3, 3, 4, 4, 5)):
...     print(item)
... 
1
2
3
4
5
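
For comparison, the index-based skeleton from the question can also be completed, as long as the input is a sequence that supports len() and indexing. This is only a sketch under that assumption; UniqueByIndex and its attribute names are made up for illustration:

```python
class UniqueByIndex:
    """Iterate over a sequence, skipping items already returned."""

    def __init__(self, data):
        self.data = data    # assumed: supports len() and indexing
        self.i = 0          # position of the next item to inspect
        self.seen = set()   # items returned so far (must be hashable)

    def __iter__(self):
        return self

    def __next__(self):
        while self.i < len(self.data):
            item = self.data[self.i]
            self.i += 1
            if item not in self.seen:
                self.seen.add(item)
                return item
        # Ran off the end of the sequence: signal exhaustion.
        raise StopIteration


print(list(UniqueByIndex([1, 2, 3, 3, 4, 4, 5])))  # [1, 2, 3, 4, 5]
```

The class above that wraps iter(iterable) is more general, since it accepts any iterable, including generators and files, whereas this variant needs random access.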

You can refer to this answer and to the Iterator Types section in the Python 3 Data Model documentation.

Upvotes: 8

Jimothy

Reputation: 9730

If preserving the original order is not important, simply use set:

values = (1, 3, 2, 5, 4, 3)
unique_values = set(values)
print(unique_values)
{1, 2, 3, 4, 5}

Upvotes: 0
