vkaul11
vkaul11

Reputation: 4214

set and frozenset difference in implementation

I checked on this link that set is mutable https://docs.python.org/3/library/stdtypes.html#frozenset while frozenset is immutable and hence hashable. So how is the set implemented in python and what is the element look up time? Actually I had a list of tuples [(1,2),(3,4),(2,1)] where each entry in the tuple is a id and I wanted to create a set/frozenset out of the this list. In this case the set should contain (1,2,3,4) as elements. Can I use frozenset to insert elements into it one by one from the list of tuples or I can only use a set?

Upvotes: 15

Views: 10274

Answers (3)

John La Rooy
John La Rooy

Reputation: 304205

You can instantiate a frozenset from a generator expression or other iterable. It's not immutable until it's finished being instantiated.

>>> L = [(1,2),(3,4),(2,1)]
>>> from itertools import chain
>>> frozenset(chain.from_iterable(L))
frozenset([1, 2, 3, 4])

Python3.3 also has an optimisation that turns set literals such as {1, 2, 3, 4} into precomputed frozensets when used as the right-hand side of an in operator.

Upvotes: 8

user2357112
user2357112

Reputation: 280887

Sets and frozensets are implemented the same way, as hashtables. (Why else would they require their elements to implement __hash__?) In fact, if you look at Objects/setobject.c, they share almost all their code. This means that as long as hash collisions don't get out of hand, lookup and deletion are O(1) and insertion is amortized O(1).

The usual way to create a frozenset is to initialize it with some other iterable. As gnibbler suggested, the best fit here would probably be itertools.chain.from_iterable:

>>> L = [(1,2),(3,4),(2,1)]
>>> from itertools import chain
>>> frozenset(chain.from_iterable(L))
frozenset([1, 2, 3, 4])

Upvotes: 5

Dolda2000
Dolda2000

Reputation: 25855

As for your first question, I haven't actually checked the source, but it seems safe to assume from the fact that sets need to contain objects of hashable types, that it is implemented using a hash table, and that its lookup time is, therefore, O(1).

As for your second question, you cannot insert the elements into a frozenset one by one (obviously, since it's immutable), but there's no reason to use a set instead; just construct it from a list (or other iterable) of the constituent values, e.g. like this:

data = [(1, 2), (3, 4), (2, 1)]
result = frozenset(reduce(list.__add__, [list(x) for x in data], []))

Upvotes: -2

Related Questions