nicomp
nicomp

Reputation: 4647

What data structure acts like a dictionary and supports multuiple values per key?

I have player and uniform data. Player name is the key, uniform number is the value. A player may wear multiple uniform numbers over their career.

{"Michael Jordan":45}
{"Michael Jordan":23}

This won't work because the second entry replaces the first:

myData = {}
myData["Michael Jordan"] = 45
myData["Michael Jordan"] = 23

I can't store that in a dictionary. What data structure should be used?

Upvotes: 1

Views: 214

Answers (1)

Stef
Stef

Reputation: 15505

A dict of lists or a dict of sets is a pretty common structure, to have more than one value per key.

data = [('Michael Jordan', 45), ('Kobe Bryant', 8), ('Michael Jordan', 23), ('Kobe Bryant', 24), ('Michael Jordan', 23)]

d = {}
for player, uniform in data:
    if player not in d:
        d[player] = [uniform]
    else:
        d[player].append(uniform)

Personally I don't like the if / else in that loop. You can get rid of it using the default value of dict.get, or using dict.setdefault, or using a defaultdict:

# 1ST METHOD: DICT.GET WITH DEFAULT VALUE
d = {}
for player, uniform in data:
    d[player] = d.get(player, [])
    d[player].append(uniform)

# 2ND METHOD: DICT.SETDEFAULT
d = {}
for player, uniform in data:
    d.setdefault(player, []).append(uniform)

# 3RD METHOD: DEFAULTDICT
from collections import defaultdict
d = defaultdict(list)
for player, uniform in data:
    d[player].append(uniform)

In all cases, the result is the same:

# 1st and 2nd methods
print(d)
# {'Michael Jordan': [45, 23, 23], 'Kobe Bryant': [8, 24]}

# 3rd method
print(d)
# defaultdict(<class 'list'>, {'Michael Jordan': [45, 23, 23], 'Kobe Bryant': [8, 24]})

However, perhaps we don't like this duplicate value 23 in Michael Jordan's uniforms. If we don't care about the order of the values, but we care about getting rid of duplicates, then the go-to data structure we need is a set, not a list.

Our 4 possible methods become:

d = {}
for player, uniform in data:
    if player not in d:
        d[player] = {uniform}
    else:
        d[player].add(uniform)

# 1ST METHOD: DICT.GET WITH DEFAULT VALUE
d = {}
for player, uniform in data:
    d[player] = d.get(player, set())
    d[player].add(uniform)

# 2ND METHOD: DICT.SETDEFAULT
d = {}
for player, uniform in data:
    d.setdefault(player, set()).add(uniform)

# 3RD METHOD: DEFAULTDICT
from collections import defaultdict
d = defaultdict(set)
for player, uniform in data:
    d[player].add(uniform)

# RESULT

# 1st and 2nd methods
print(d)
# {'Michael Jordan': {45, 23}, 'Kobe Bryant': {8, 24}}

# 3rd method
print(d)
# defaultdict(<class 'set'>, {'Michael Jordan': {45, 23}, 'Kobe Bryant': {8, 24}})

Finally, I'd like to show one more method, which is to use function map_reduce from module more_itertools:

from operator import itemgetter
from more_itertools import map_reduce

# DICT OF LIST
d = map_reduce(data, keyfunc=itemgetter(0), valuefunc=itemgetter(1))
print(d)
# defaultdict(None, {'Michael Jordan': [45, 23, 23], 'Kobe Bryant': [8, 24]})

# DICT OF SET
d = map_reduce(data, keyfunc=itemgetter(0), valuefunc=itemgetter(1), reducefunc=set)
print(d)
# defaultdict(None, {'Michael Jordan': {45, 23}, 'Kobe Bryant': {8, 24}})

Upvotes: 2

Related Questions