Reputation: 7961
I have the dataset below and want to use it to update a MySQL table. I can do that with the data in its current form, but I thought converting it to a dictionary would shrink the list of updates.
My dataset:
dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]
Desired output:
A dictionary:
output = {"set(['NY'])": [121, 198, 676], "set(['CA', 'NY'])": [132, 89]}
Upvotes: 2
Views: 722
Reputation: 304137
You must use a frozenset for the key. There is no guarantee that a set with the same elements will always be turned into the same repr or tuple, as sets are unordered. Unless you sort the set elements first, of course, but that seems wasteful.
from collections import defaultdict
dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]
output = defaultdict(list)
for value, key in dataset:
    output[frozenset(key)].append(value)
Or, using a sorted tuple:
from collections import defaultdict
dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]
output = defaultdict(list)
for value, key in dataset:
    output[tuple(sorted(key))].append(value)
A random example to illustrate this:
>>> s,t = set([736, 9753, 7126, 7907, 3350]), set([3350, 7907, 7126, 9753, 736])
>>> s == t
True
>>> tuple(s) == tuple(t)
False
>>> frozenset(s) == frozenset(t)
True
>>> hash(tuple(s)) == hash(tuple(t))
False
>>> hash(frozenset(s)) == hash(frozenset(t))
True
Upvotes: 5
Reputation: 208425
Here is an alternative to defaultdict:
dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]
output = {}
for value, key in dataset:
    output.setdefault(frozenset(key), []).append(value)
Result:
>>> output
{frozenset(['NY', 'CA']): ['132', '89'], frozenset(['NY']): ['121', '198', '676']}
I prefer using setdefault() over defaultdict here because of the following behavior:
>>> from collections import defaultdict
>>> output = defaultdict(list, {frozenset(['NY', 'CA']): ['132', '89'], frozenset(['NY']): ['121', '198', '676']})
>>> output[frozenset(['FL'])]  # instead of raising a KeyError, this modifies output
[]
>>> output
defaultdict(<type 'list'>, {frozenset(['NY', 'CA']): ['132', '89'], frozenset(['FL']): [], frozenset(['NY']): ['121', '198', '676']})
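If defaultdict is used anyway, that side effect can be avoided once the grouping is finished. A minimal sketch of two ways to do so, using a shortened version of the data above (the names frozen and raised are only for illustration):

```python
from collections import defaultdict

output = defaultdict(list)
# Grouping phase: missing keys are created on first access, as intended
output[frozenset(['NY'])].append('121')

# Option 1: copy the finished result into a plain dict
frozen = dict(output)

# Option 2: disable the factory, so missing keys raise KeyError from now on
output.default_factory = None

try:
    output[frozenset(['FL'])]
    raised = False
except KeyError:
    raised = True
```

After either step, looking up a key that was never grouped raises a KeyError instead of silently adding an empty list.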
Upvotes: 1
Reputation: 235994
Try this:
dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]
from collections import defaultdict
d = defaultdict(list)
for val, key in dataset:
    d[repr(key)].append(int(val))
d
> {"set(['NY', 'CA'])": [132, 89], "set(['NY'])": [121, 198, 676]}
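Because the repr of a set follows its iteration order, two equal sets are not guaranteed to produce the same string. If a string key is wanted, one way to make it deterministic is to sort the elements before joining them. A minimal sketch, where the ',' separator is an assumption and not something from the question:

```python
from collections import defaultdict

dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])),
           ('676', set(['NY'])), ('89', set(['NY', 'CA']))]

d = defaultdict(list)
for val, key in dataset:
    # sorted() fixes the element order, so equal sets always give the same string
    d[','.join(sorted(key))].append(int(val))
```

Here the keys come out as 'NY' and 'CA,NY', regardless of how each set happens to be ordered internally.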
Upvotes: 1
Reputation: 54242
I don't think you can have a set as a dictionary key, so maybe a tuple?
from collections import defaultdict
dataset = [('121', set(['NY'])), ('132', set(['CA', 'NY'])), ('198', set(['NY'])), ('676', set(['NY'])), ('89', set(['NY', 'CA']))]
output = defaultdict(list)
for value, key in dataset:
    output[tuple(key)].append(value)
    # or output[str(key)].append(value) if you want a string as the key
Upvotes: 2