Reputation: 157
I am trying to remove consecutive duplicate values from a nested dictionary:
a = {0:{'time':11}, 1:{'time':12}, 2:{'time':12}, 3:{'time':13}}
Is there any way to remove 2:{'time':12} and get
b = {0:{'time':11}, 1:{'time':12}, 2:{'time':13}}
My code is:
m = {}
for key, value in a.items():
    if key == 0:
        m[0] = value
    elif a[key] != a[key - 1]:
        m[key] = value
But the result is {0: {'time': 11}, 1: {'time': 12}, 3: {'time': 13}}
I am wondering if there is any way to get the result as dict b, and if there is any way to do this faster, because I have a lot of data to deal with. Any help will be appreciated!
Upvotes: 1
Views: 2293
Reputation: 20025
First let's create a list of all times:
>>> c = [a[k]['time'] for k in sorted(a)]
>>> c
[11, 12, 12, 13]
Then let's use groupby to group consecutive equal values:
>>> from itertools import groupby
>>> d = [x for x, y in groupby(c)]
>>> d
[11, 12, 13]
Now we can zip the keys with the new values and create a dictionary:
>>> dict(zip(sorted(a), d))
{0: 11, 1: 12, 2: 13}
We can combine all steps:
>>> keys = sorted(a)
>>> dict(zip(keys, (x for x, y in groupby(a[k]['time'] for k in keys))))
{0: 11, 1: 12, 2: 13}
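Note that the result above maps each key to a bare number, while the question asks for the nested {'time': ...} dicts. A minimal sketch of one way to keep them, assuming the keys are sortable and only consecutive duplicates should be dropped (the names dedupe_consecutive and renumbered are made up for this illustration):

```python
from itertools import groupby

def dedupe_consecutive(a):
    """Drop consecutive entries with equal 'time' and renumber keys from 0."""
    # Walk the values in key order so that "consecutive" is well defined.
    ordered = (a[k] for k in sorted(a))
    # groupby yields one group per run of equal 'time' values;
    # next(group) keeps only the first value of each run.
    firsts = (next(group) for _, group in groupby(ordered, key=lambda v: v['time']))
    return dict(enumerate(firsts))

a = {0: {'time': 11}, 1: {'time': 12}, 2: {'time': 12}, 3: {'time': 13}}
renumbered = dedupe_consecutive(a)
print(renumbered)
```

The sort dominates the cost; the grouping itself is a single pass over the values.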
Upvotes: 3
Reputation: 133929
Do groupby on the items sorted by the time value, then enumerate into a dictionary:
from itertools import groupby
a = {0:{'time':11}, 1:{'time':12}, 2:{'time':12}, 3:{'time':13}}
b = dict(enumerate(next(i[1])[1] for i in
                   groupby(sorted(a.items(),
                                  key=lambda i: i[1]['time']),
                           lambda i: i[1]['time'])))
b is now
{0: {'time': 11}, 1: {'time': 12}, 2: {'time': 13}}
though I seriously question the usability of such a structure for this kind of task.
Upvotes: 1
Reputation: 2706
You can remove elements from a dict with pop, and assign them with the myDict[key] syntax, so this is simply:
a = {0:{'time':11}, 1:{'time':12}, 2:{'time':12}, 3:{'time':13}}
a[2] = a.pop(3)
# a is now {0: {'time': 11}, 1: {'time': 12}, 2: {'time': 13}}
From the example, though, it is not clear that this is the right way to organize your dictionary for your task. For example, if the only values in your dictionary are {'time': <someNumber>}, why not simply have the value be someNumber?
Also, as noted elsewhere, looping over a dictionary when ordering matters is a bad idea unless you sort the keys first: depending on the Python version, elements are not guaranteed to come out in the order you think they are.
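For the general case, here is a rough sketch of a plain loop that makes the ordering explicit by iterating over sorted keys, and that numbers the kept values with a running index instead of reusing the original key (which is why the loop in the question ends up with key 3). The function name renumber_without_repeats is hypothetical, and it assumes that only values equal to their immediate predecessor should be dropped:

```python
def renumber_without_repeats(a):
    """Copy a into a new dict, skipping values equal to the previous one."""
    result = {}
    previous = object()  # sentinel that compares unequal to every real value
    for key in sorted(a):  # explicit order, independent of dict internals
        value = a[key]
        if value != previous:
            result[len(result)] = value  # next free index: 0, 1, 2, ...
            previous = value
    return result

a = {0: {'time': 11}, 1: {'time': 12}, 2: {'time': 12}, 3: {'time': 13}}
compact = renumber_without_repeats(a)
print(compact)
```

Unlike the pop approach, this does not need to know in advance which key holds the duplicate.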
Upvotes: 0
Reputation: 1933
A dictionary is probably not what you want for this kind of task. Instead, use a heap priority queue, which is an efficient, self-ordering alternative, depending on some key of your choice. python.org – Heap queue algorithm
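The answer does not say how the heap would be used, so the following is only one possible reading: push (time, original_key) pairs onto a heap with the standard heapq module, pop them in time order, and skip any time equal to the one just kept. The variable names are illustrative.

```python
import heapq

a = {0: {'time': 11}, 1: {'time': 12}, 2: {'time': 12}, 3: {'time': 13}}

# Heap entries are (time, original_key) tuples, so the heap orders by time first.
heap = [(v['time'], k) for k, v in a.items()]
heapq.heapify(heap)

unique_times = []
while heap:
    time, _ = heapq.heappop(heap)  # smallest remaining time
    if not unique_times or unique_times[-1] != time:
        unique_times.append(time)  # skip repeats of the previous time

print(unique_times)
```

Because the heap orders by time rather than by the original key, this drops every repeated time, not only consecutive ones.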
Upvotes: 0
Reputation: 31339
Use a reversed mapping to the minimal index:
reverse_mapping = {}
for k, v in a.items():
    key = v['time']
    # we want the minimal index of the item
    reverse_mapping[key] = min(k, reverse_mapping.get(key, k))
Now reverse the mapping again after you've filtered out the needless items:
reversed_original = {v: k for k, v in reverse_mapping.items()}
Now create a new list of items based on the filtered list and use the dict constructor to restore a dict from it:
result = dict([(x, {'time': v[1]}) for x, v in enumerate(
    sorted(reversed_original.items())
)])
Output:
{0: {'time': 11}, 1: {'time': 12}, 2: {'time': 13}}
Upvotes: 0
Reputation: 39
Instead of saving the value as a separate dictionary item could you instead use a tuple? For example, your key could be 1, 2, etc. and your value ('time',11) or ('time',12). I'm really assuming that your values are not going to be changed so an immutable data type like tuple could be a solution.
Upvotes: 0