skyking

Reputation: 14358

Why does `dict_display` allow duplicate keys?

I'm not asking whether the standard permits it, but what the rationale is. The specification says:

If a comma-separated sequence of key/datum pairs is given, they are evaluated from left to right to define the entries of the dictionary: each key object is used as a key into the dictionary to store the corresponding datum. This means that you can specify the same key multiple times in the key/datum list, and the final dictionary’s value for that key will be the last one given.

This means that it's perfectly legal to form a dict by:

d = { 'a': 1, 'b':2, 'b':3 }
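To make the quoted rule concrete, here is a minimal check of what that display evaluates to (just an illustration of the documented behavior, nothing more):

```python
d = {'a': 1, 'b': 2, 'b': 3}

# The second 'b' silently replaces the first; no error or warning is raised.
print(d)       # {'a': 1, 'b': 3}
print(len(d))  # 2
```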

However, I can hardly see why anyone would want to define a dict that way; most often I'd guess it is a mistake. By comparison, the corresponding construct for keyword arguments in a function call is forbidden.

Is there a good way to avoid this?

Upvotes: 1

Views: 253

Answers (3)

Chris_Rands

Reputation: 41168

There are at least a couple of cases where you might rely on a dict display accepting identical keys, since the display is evaluated left to right and the last value given wins.

1) If multiple keys evaluate to the same value, but you only want to keep the last instance.

For example, imagine you want to return a number if it's even, or else 'odd'; you could use a dictionary:

def f(n):
    return {True: n, n % 2: 'odd'}[True]

Of course there are more readable ways for this example, like using an if-else clause, but it illustrates the point.
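The trick relies on `1 == True` and `hash(1) == hash(True)`, so for odd `n` the key `n % 2` collides with `True` and overwrites its value. A small sketch that exercises the same function (the sample inputs are arbitrary):

```python
def f(n):
    # For odd n, n % 2 is 1, which compares equal to True and hashes the same,
    # so the second pair overwrites the value stored under the True key.
    return {True: n, n % 2: 'odd'}[True]

print(f(4))  # 4
print(f(7))  # odd
print(1 == True, hash(1) == hash(True))  # True True
```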

2) With OrderedDict, taking advantage of this behavior is the recommended way to remove duplicates from a list while preserving the order, as Raymond Hettinger says. For example:

from collections import OrderedDict
list(OrderedDict.fromkeys(['a','b','d','d','a']))
# ['a', 'b', 'd']

If you are concerned about this behavior, check that your keys are unique before building the dictionary, for example with `assert len(keys) == len(set(keys))`. Or you could check that the key is not already in the dictionary before adding it: `if key not in my_dict: my_dict[key] = value`.
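Both checks can be sketched as follows, assuming the keys and values arrive as two parallel lists (the names `keys`, `values` and `my_dict` are only illustrative):

```python
keys = ['a', 'b', 'b']
values = [1, 2, 3]

# Approach 1: detect duplicates up front, before building anything.
has_duplicates = len(keys) != len(set(keys))
print(has_duplicates)  # True

# Approach 2: keep the first value seen for each key instead of the last.
my_dict = {}
for key, value in zip(keys, values):
    if key not in my_dict:
        my_dict[key] = value
print(my_dict)  # {'a': 1, 'b': 2}
```

Note that the second approach changes the semantics from "last one wins" to "first one wins" rather than reporting the duplicate.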

Upvotes: 1

Mark

Reputation: 19969

I found this discussion, which raises this point:

d = {spam(a): 'a', spam(b): 'BB', spam(c): 'Ccc'}

This not only highlights that any duplicate-key check would have to happen at runtime, but also that there are cases where you might want to allow duplicates: for example, when code is being generated, or in dict comprehensions that overwrite defaults.

import itertools

defaults = {'a': 1, 'b': 2}
specific = {'b': 3, 'c': 4}
# Later pairs overwrite earlier ones, so values from specific take precedence.
combined = {key: val for key, val in itertools.chain(defaults.items(), specific.items())}

As a personal note, it also fits well with .update, which adds or updates a key, not complaining when it already exists.
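For comparison, a minimal sketch of the same merge done with `.update`, which overwrites existing keys just as quietly:

```python
defaults = {'a': 1, 'b': 2}
specific = {'b': 3, 'c': 4}

combined = dict(defaults)   # copy, so defaults itself is left untouched
combined.update(specific)   # 'b' is overwritten without complaint, 'c' is added
print(combined)  # {'a': 1, 'b': 3, 'c': 4}
```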

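As for a way to prevent this: when the keys are valid Python identifiers (and not reserved keywords), you can pass them to dict() as keyword arguments, where a repeated keyword is rejected at compile time:

d = dict(a=1, b=2, b=3)  # SyntaxError: keyword argument repeated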

You can of course make your own wrapper, but it'll look ugly:

def uniqdict(items):
    dct = {}
    for key, val in items:
        if key in dct:
            raise KeyError('key {0:} already exists'.format(key))
        dct[key] = val
    return dct

uniqdict((('a', 1), ('b', 2), ('b', 3)))  # raises KeyError because 'b' is given twice
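If the goal is only to catch accidental duplicates among literal keys, a static check is another option. The following is a rough sketch using the standard `ast` module; the function name `find_duplicate_literal_keys` is made up for this example, and it deliberately ignores computed keys such as `spam(a)`, which can only be compared at runtime:

```python
import ast

def find_duplicate_literal_keys(source):
    """Return (line, key) pairs for constant keys repeated within one dict display."""
    duplicates = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        seen = set()
        for key in node.keys:
            # key is None for ** unpacking; non-constant keys are skipped
            # because their values are unknown until the code runs.
            if isinstance(key, ast.Constant):
                if key.value in seen:
                    duplicates.append((key.lineno, key.value))
                seen.add(key.value)
    return duplicates

print(find_duplicate_literal_keys("d = {'a': 1, 'b': 2, 'b': 3}"))  # [(1, 'b')]
```

Several linters perform a similar check, so in practice it is usually easier to enable that than to maintain something like this by hand.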

Upvotes: 2

Uriel

Reputation: 16184

This "bug" was reported, discussed, and finally rejected; see https://bugs.python.org/issue16385.

The main reasons given for rejecting it were that

A code generator could depend on being able to write duplicate keys without having to go back and erase previous output.

and that

An error is out of the question for compatibility reasons.

Upvotes: 3
