Ningxi

Reputation: 157

python remove duplicate value in dictionary and change key

I am trying to remove the duplicate values in a nested dictionary:

a = {0:{'time':11}, 1:{'time':12}, 2:{'time':12}, 3:{'time':13}}

Is there any way to remove 2:{'time':12} and get

b = {0:{'time':11}, 1:{'time':12}, 2:{'time':13}}

My code is:

m = {}
for key, value in a.items():
    if key == 0:
        m[0] = value
    elif a[key] != a[key - 1]:
        m[key] = value

But the result is {0: {'time': 11}, 1: {'time': 12}, 3: {'time': 13}}. Is there any way to get the result as dict b? And is there a faster way to do this? I have a lot of data to deal with. Any help will be appreciated!

Upvotes: 1

Views: 2293

Answers (6)

JuniorCompressor

Reputation: 20025

First, let's create a list of all times, in key order:

>>> c = [a[k]['time'] for k in sorted(a)]
>>> c
[11, 12, 12, 13]

Then let's use groupby to collapse runs of consecutive equal values:

>>> from itertools import groupby
>>> d = [x for x, y in groupby(c)]
>>> d
[11, 12, 13]

Now we can zip the keys with the new values and create a dictionary:

>>> dict(zip(sorted(a), d))
{0: 11, 1: 12, 2: 13}

We can combine all steps:

>>> keys = sorted(a)
>>> dict(zip(keys, (x for x, y in groupby(a[k]['time'] for k in keys))))
{0: 11, 1: 12, 2: 13}
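The result above maps each key to the bare time value. If the nested {'time': ...} shape from the question is needed, a minimal sketch (assuming Python 3 and that duplicates are always adjacent once the keys are sorted) could wrap each value again:

```python
from itertools import groupby

a = {0: {'time': 11}, 1: {'time': 12}, 2: {'time': 12}, 3: {'time': 13}}

# Times in key order; groupby collapses runs of equal consecutive values.
times = (a[k]['time'] for k in sorted(a))

# enumerate renumbers the surviving values from 0; each value is wrapped
# back into the {'time': ...} shape used in the question.
b = {i: {'time': t} for i, (t, _group) in enumerate(groupby(times))}
print(b)
```

This makes one pass over the data after the sort, so it should scale reasonably for large inputs.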

Upvotes: 3

Run groupby on the items sorted by the time value, then enumerate the first item of each group into a dictionary:

from itertools import groupby

a = {0:{'time':11}, 1:{'time':12}, 2:{'time':12}, 3:{'time':13}}
b = dict(enumerate(next(i[1])[1] for i in
          groupby(sorted(a.items(),
                         key=lambda i: i[1]['time']),
                         lambda i: i[1]['time'])))

b is now

{0: {'time': 11}, 1: {'time': 12}, 2: {'time': 13}}

though I seriously question the usability of such a structure for this kind of task.
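Note that sorting by time first removes every repeated time, not only adjacent ones, and orders the result by time. If the original key order should be kept instead, one possible alternative is a plain loop with a set of seen values. This is a sketch, not the answer's method; the function name dedupe_by_time is made up for illustration:

```python
a = {0: {'time': 11}, 1: {'time': 12}, 2: {'time': 12}, 3: {'time': 13}}

def dedupe_by_time(d):
    """Keep the first entry for each distinct 'time', renumbering keys from 0.

    Hypothetical helper: entries are visited in sorted key order.
    """
    seen = set()
    result = {}
    for key in sorted(d):
        t = d[key]['time']
        if t not in seen:
            seen.add(t)
            # len(result) is the next free consecutive key.
            result[len(result)] = d[key]
    return result

b = dedupe_by_time(a)
print(b)
```

Set membership checks are constant time on average, so the loop itself is linear in the number of entries.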

Upvotes: 1

BWStearns

Reputation: 2706

You can remove elements from a dict with pop and assign them with the myDict[key] syntax, so for this example it is simply:

a = {0:{'time':11}, 1:{'time':12}, 2:{'time':12}, 3:{'time':13}}
a[2] = a.pop(3)
# a is now {0: {'time': 11}, 1: {'time': 12}, 2: {'time': 13}}

From the example, though, it is unclear whether this is the right way to organize your dictionary for your task. For example, if the only values in your dictionary are {'time': <someNumber>}, why not simply store someNumber as the value?

Also, as noted elsewhere, looping over a dictionary where ordering is needed is a very bad idea in Python versions before 3.7, where elements are not guaranteed to be in the order you think they are. Iterating over sorted(a) avoids the problem.
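The pop call above only fits this particular example. A more general sketch, assuming the goal is to drop entries equal to their predecessor in key order and renumber the rest, might look like this (compact_consecutive is a made-up name):

```python
a = {0: {'time': 11}, 1: {'time': 12}, 2: {'time': 12}, 3: {'time': 13}}

def compact_consecutive(d):
    """Drop entries equal to the previous one (in sorted key order)
    and renumber the survivors from 0. Hypothetical helper."""
    result = {}
    previous = object()  # sentinel that is unequal to any stored value
    for key in sorted(d):
        if d[key] != previous:
            result[len(result)] = d[key]
            previous = d[key]
    return result

b = compact_consecutive(a)
print(b)
```

Iterating over sorted(d) rather than the dict itself keeps the comparison order explicit, and keys do not need to start at 0 or be contiguous.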

Upvotes: 0

gustafbstrom

Reputation: 1933

A dictionary is probably not what you want for this kind of task. Instead, consider a heap-based priority queue, an efficient structure that always hands back the smallest item first, ordered by a key of your choice. See python.org – Heap queue algorithm.
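As a rough sketch of how that could look with the standard heapq module: a heap does not remove duplicates by itself, so the loop below skips a value when it equals the one popped just before it. The variable names are illustrative only.

```python
import heapq

a = {0: {'time': 11}, 1: {'time': 12}, 2: {'time': 12}, 3: {'time': 13}}

# Build a min-heap of the time values in place.
heap = [v['time'] for v in a.values()]
heapq.heapify(heap)

# Pop values in ascending order; equal values come out back to back,
# so comparing with the last kept value is enough to skip duplicates.
unique_times = []
while heap:
    t = heapq.heappop(heap)
    if not unique_times or unique_times[-1] != t:
        unique_times.append(t)

print(unique_times)
```

Like the sort-based approaches, this orders the result by time rather than by the original keys.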

Upvotes: 0

Reut Sharabani

Reputation: 31339

Use a reverse mapping from each time value to the minimal index that holds it:

reverse_mapping = {}
for k, v in a.items():
    key = v['time']
    # we want the minimal index of the item
    reverse_mapping[key] = min(k, reverse_mapping.get(key, k))

Now reverse the mapping again, which leaves only the first index of each time value:

reversed_original = {v: k for k, v in reverse_mapping.items()}

Now build a new list of items from the filtered mapping, renumbering the keys with enumerate, and use the dict constructor to restore a dict from it:

result = dict([(x, {'time': v[1]}) for x, v in enumerate(
    sorted(reversed_original.items())
)])

Output:

{0: {'time': 11}, 1: {'time': 12}, 2: {'time': 13}}

Upvotes: 0

user2722670

Reputation: 39

Instead of saving each value as a separate dictionary, could you use a tuple? For example, your keys could be 1, 2, etc. and your values ('time', 11) or ('time', 12). I'm assuming that your values are not going to change, so an immutable data type like a tuple could be a solution.
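One practical consequence of tuples is that they are hashable, so duplicates can be dropped with dict.fromkeys. The following is only a sketch, assuming Python 3.7+ (where dicts preserve insertion order) and the tuple layout suggested above:

```python
# Same data as the question, with tuples instead of inner dicts.
a = {0: ('time', 11), 1: ('time', 12), 2: ('time', 12), 3: ('time', 13)}

# dict.fromkeys keeps the first occurrence of each hashable value,
# in the order the values are seen (key order here).
unique_values = list(dict.fromkeys(a[k] for k in sorted(a)))

# Renumber the surviving values from 0.
b = dict(enumerate(unique_values))
print(b)
```

Unlike the consecutive-only approaches, this removes every repeated value, wherever it appears.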

Upvotes: 0
