ThinkCode

Reputation: 7961

How do I convert a list of tuples with sets to a dictionary?

I have this dataset and want to use it to update a MySQL table. I can do that with the data in its current form, but I thought converting it to a dictionary would shrink the list of updates.

My dataset:

dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]

Desired output, a dictionary:

output = {"set(['NY'])": [121, 198, 676], "set(['CA', 'NY'])": [132, 89]}

Upvotes: 2

Views: 722

Answers (4)

John La Rooy

Reputation: 304137

A set is mutable and therefore unhashable, so use a frozenset as the key. Converting the set with repr() or tuple() is unreliable: sets are unordered, so two sets with the same elements are not guaranteed to produce the same repr or tuple. Sorting the elements first avoids that, but it seems wasteful.

from collections import defaultdict

dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]
output = defaultdict(list)
for value, key in dataset:
    output[frozenset(key)].append(value)

Or, using a sorted tuple as the key:

from collections import defaultdict

dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]
output = defaultdict(list)
for value, key in dataset:
    output[tuple(sorted(key))].append(value)

A random example showing that two equal sets can produce different tuples, while their frozensets always compare and hash the same:

>>> s,t = set([736, 9753, 7126, 7907, 3350]), set([3350, 7907, 7126, 9753, 736])
>>> s == t
True
>>> tuple(s) == tuple(t)
False
>>> frozenset(s) == frozenset(t)
True
>>> hash(tuple(s)) == hash(tuple(t))
False
>>> hash(frozenset(s)) == hash(frozenset(t))
True

Upvotes: 5

Andrew Clark

Reputation: 208425

Here is an alternative to defaultdict:

dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]

output = {}
for value, key in dataset:
    output.setdefault(frozenset(key), []).append(value)

Result:

>>> output
{frozenset(['NY', 'CA']): ['132', '89'], frozenset(['NY']): ['121', '198', '676']}

I prefer using setdefault() over defaultdict here because of the following behavior:

>>> from collections import defaultdict
>>> output = defaultdict(list, {frozenset(['NY', 'CA']): ['132', '89'], frozenset(['NY']): ['121', '198', '676']})
>>> output[frozenset(['FL'])]    # instead of a key error, this modifies output
[]
>>> output
defaultdict(<type 'list'>, {frozenset(['NY', 'CA']): ['132', '89'], frozenset(['FL']): [], frozenset(['NY']): ['121', '198', '676']})
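If defaultdict is still convenient while building the mapping, one way to avoid that side effect is to look keys up with `in` or `.get()`, which do not insert missing keys, or to convert the result to a plain dict once it is built. A minimal Python 3 sketch (the variable names are illustrative):

```python
from collections import defaultdict

dataset = [('121', {'NY'}), ('132', {'CA', 'NY'}), ('198', {'NY'}),
           ('676', {'NY'}), ('89', {'NY', 'CA'})]

grouped = defaultdict(list)
for value, key in dataset:
    grouped[frozenset(key)].append(value)

missing = frozenset(['FL'])

# Membership tests and .get() do not trigger the default factory,
# so the defaultdict is left unchanged.
found = missing in grouped
fallback = grouped.get(missing, [])
size_after_lookups = len(grouped)

# Converting to a plain dict restores the usual KeyError for missing keys.
output = dict(grouped)
try:
    output[missing]
    raised = False
except KeyError:
    raised = True
```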

Upvotes: 1

Óscar López

Reputation: 235994

Try this:

dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]
from collections import defaultdict
d = defaultdict(list)

for val, key in dataset:
    d[repr(key)].append(int(val))

d
> {"set(['NY', 'CA'])": [132, 89], "set(['NY'])": [121, 198, 676]}
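Note that repr() of a set follows the set's iteration order, and in Python 3 it prints as {'NY', 'CA'} rather than set(['NY', 'CA']). If a string key is what you want, a more predictable variant is to build it from the sorted elements. A sketch, where the comma separator is an arbitrary choice and not something the question specifies:

```python
from collections import defaultdict

dataset = [('121', {'NY'}), ('132', {'CA', 'NY'}), ('198', {'NY'}),
           ('676', {'NY'}), ('89', {'NY', 'CA'})]

d = defaultdict(list)
for val, key in dataset:
    # Sorting first makes the string independent of set iteration order.
    d[','.join(sorted(key))].append(int(val))

result = dict(d)
```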

Upvotes: 1

Brendan Long

Reputation: 54242

A set is unhashable, so it can't be a dictionary key. Maybe use a tuple instead?

from collections import defaultdict

dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]
output = defaultdict(list)
for value, key in dataset:
    output[tuple(key)].append(value)
    # or output[str(key)].append(value) if you want a string as the key
    # note: both depend on the set's iteration order; tuple(sorted(key)) gives a stable key

Upvotes: 2
