Reputation: 4647
I have player and uniform data. Player name is the key, uniform number is the value. A player may wear multiple uniform numbers over their career.
{"Michael Jordan":45}
{"Michael Jordan":23}
This won't work because the second entry replaces the first:
myData = {}
myData["Michael Jordan"] = 45
myData["Michael Jordan"] = 23
I can't store that in a dictionary. What data structure should be used?
Upvotes: 1
Views: 214
Reputation: 15505
A dict of lists or a dict of sets is a pretty common structure, to have more than one value per key.
data = [('Michael Jordan', 45), ('Kobe Bryant', 8), ('Michael Jordan', 23), ('Kobe Bryant', 24), ('Michael Jordan', 23)]
d = {}
for player, uniform in data:
if player not in d:
d[player] = [uniform]
else:
d[player].append(uniform)
Personally I don't like the if / else
in that loop. You can get rid of it using the default value of dict.get
, or using dict.setdefault
, or using a defaultdict
:
# 1ST METHOD: DICT.GET WITH DEFAULT VALUE
d = {}
for player, uniform in data:
d[player] = d.get(player, [])
d[player].append(uniform)
# 2ND METHOD: DICT.SETDEFAULT
d = {}
for player, uniform in data:
d.setdefault(player, []).append(uniform)
# 3RD METHOD: DEFAULTDICT
from collections import defaultdict
d = defaultdict(list)
for player, uniform in data:
d[player].append(uniform)
In all cases, the result is the same:
# 1st and 2nd methods
print(d)
# {'Michael Jordan': [45, 23, 23], 'Kobe Bryant': [8, 24]}
# 3rd method
print(d)
# defaultdict(<class 'list'>, {'Michael Jordan': [45, 23, 23], 'Kobe Bryant': [8, 24]})
However, perhaps we don't like this duplicate value 23
in Michael Jordan's uniforms. If we don't care about the order of the values, but we care about getting rid of duplicates, then the go-to data structure we need is a set
, not a list
.
Our 4 possible methods become:
d = {}
for player, uniform in data:
if player not in d:
d[player] = {uniform}
else:
d[player].add(uniform)
# 1ST METHOD: DICT.GET WITH DEFAULT VALUE
d = {}
for player, uniform in data:
d[player] = d.get(player, set())
d[player].add(uniform)
# 2ND METHOD: DICT.SETDEFAULT
d = {}
for player, uniform in data:
d.setdefault(player, set()).add(uniform)
# 3RD METHOD: DEFAULTDICT
from collections import defaultdict
d = defaultdict(set)
for player, uniform in data:
d[player].add(uniform)
# RESULT
# 1st and 2nd methods
print(d)
# {'Michael Jordan': {45, 23}, 'Kobe Bryant': {8, 24}}
# 3rd method
print(d)
# defaultdict(<class 'set'>, {'Michael Jordan': {45, 23}, 'Kobe Bryant': {8, 24}})
Finally, I'd like to show one more method, which is to use function map_reduce
from module more_itertools
:
from operator import itemgetter
from more_itertools import map_reduce
# DICT OF LIST
d = map_reduce(data, keyfunc=itemgetter(0), valuefunc=itemgetter(1))
print(d)
# defaultdict(None, {'Michael Jordan': [45, 23, 23], 'Kobe Bryant': [8, 24]})
# DICT OF SET
d = map_reduce(data, keyfunc=itemgetter(0), valuefunc=itemgetter(1), reducefunc=set)
print(d)
# defaultdict(None, {'Michael Jordan': {45, 23}, 'Kobe Bryant': {8, 24}})
Upvotes: 2