Clifford
Clifford

Reputation: 65

Is there something simple like a set for un-hashable objects?

For hashable objects inside a dict I could easily pair down duplicate values store in a dict using a set. For example:

a = {'test': 1, 'key': 1, 'other': 2}
b = set(a.values())
print(b)

Would display [1,2]

Problem I have is I am using a dict to store mapping between variable keys in __dict__ and the corresponding processing functions that will be passed to an engine to order and process those functions, some of these functions may be fast some may be slower due to accessing an API. The problem is each function may use multiple variable, therefor need multiple mappings in the dict. I'm wondering if there is a way to do this or if I am stuck writing my own solution?


Ended up building a callable class, since caching could speed things up for me:

from collections.abc import Callable


class RemoveDuplicates(Callable):
    input_cache = []
    output_cache = []

    def __call__(self, in_list):
        if list in self.input_cache:
            idx = self.input_cache.index(in_list)
            return self.output_cache[idx]
        else:
            self.input_cache.append(in_list)
            out_list = self._remove_duplicates(in_list)
            self.output_cache.append(out_list)
            return out_list

    def _remove_duplicates(self, src_list):
        result = []
        for item in src_list:
            if item not in result:
                result.append(item)
        return result

Upvotes: 0

Views: 75

Answers (3)

martineau
martineau

Reputation: 123473

You could (indirectly) use the bisect module to create sorted collection of your values which would greatly speed-up the insertion of new values and value membership testing in general — which together can be utilized to unsure that only unique values get put into it.

In the code below, I've used un-hashable set values for the sake of illustration.

# see http://code.activestate.com/recipes/577197-sortedcollection
from sortedcollection import SortedCollection

a = {'test': {1}, 'key': {1}, 'other': {2}}

sc = SortedCollection()
for value in a.values():
    if value not in sc:
        sc.insert(value)

print(list(sc))  # --> [{1}, {2}]

Upvotes: 0

Mark Ransom
Mark Ransom

Reputation: 308206

If the objects can be ordered, you can use itertools.groupby to eliminate the duplicates:

>>> a = {'test': 1, 'key': 1, 'other': 2}
>>> b = [k for k, it in itertools.groupby(sorted(a.values()))]
>>> print(b)
[1, 2]

Upvotes: 1

Abhijit
Abhijit

Reputation: 63737

Is there something simple like a set for un-hashable objects

Not in the standard library but you need to look beyond and search for BTree implementation of dictionary. I googled and found few hits where the first one (BTree)seems promising and interesting

Quoting from the wiki

The BTree-based data structures differ from Python dicts in several fundamental ways. One of the most important is that while dicts require that keys support hash codes and equality comparison, the BTree-based structures don’t use hash codes and require a total ordering on keys.

Off-course its trivial fact that a set can be implemented as a dictionary where the value is unused.

Upvotes: 0

Related Questions