Reputation: 185
I am rearranging an OrderedDict based on keywords from a list. For example, given:
old_OD = OrderedDict([('cat_1',1),
('dog_1',2),
('cat_2',3),
('fish_1',4),
('dog_2',5)])
I also have a list giving the order of the groups:
order = ['dog', 'cat', 'fish']
and I want a result in which the items of the dictionary are grouped together in that order, like this:
new_OD = OrderedDict([('dog_1',2),
('dog_2',5),
('cat_1',1),
('cat_2',3),
('fish_1',4)])
I found two excellent related questions, How to reorder OD based on list and Re-ordering OrderedDict, and I am going with the solution from the second link:
new_od = OrderedDict([(k, None) for k in order if k in old_od])
new_od.update(old_od)
In my case, however, "k" is not an exact match for the desired keys of new_od. How should I modify this to construct the new OrderedDict?
EDIT: What happens if the underscore does not reliably mark the location of the keyword, as in "Big_cat_3" or "dog_black_2"? The keyword could be anywhere in the string. Once the keys are grouped together, alphanumeric order within a group is not needed.
Upvotes: 8
Views: 309
Reputation: 106553
A more efficient approach, with a time complexity of O(n) (instead of O(n log n) with sorting), is to build a dict that maps the substring of each key that appears in order (which should be converted to a set for efficient lookups) to a list of the matching key-value pairs from old_OD, and then build the new OrderedDict by iterating an index over range(len(order)) and passing to the OrderedDict constructor the pairs that the mapping dict stores under order[i]:
keys = set(order)
mapping = {}
for k, v in old_OD.items():
    mapping.setdefault(next(i for i in k.split('_') if i in keys), []).append((k, v))
OrderedDict(t for i in range(len(order)) for t in mapping[order[i]])
This returns:
OrderedDict([('dog_1', 2), ('dog_2', 5), ('cat_1', 1), ('cat_2', 3), ('fish_1', 4)])
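The bucketing idea can also be wrapped in a reusable function. The following is a minimal sketch rather than the code from this answer: the name `regroup` is hypothetical, it uses `collections.defaultdict` instead of `setdefault`, and it assumes that keys matching no entry of `order` should be dropped and that groups without any keys should be skipped instead of raising `KeyError`.

```python
from collections import OrderedDict, defaultdict


def regroup(od, order):
    """Group the items of `od` by the first '_'-separated token found in `order`.

    Hypothetical helper: keys with no matching token are dropped, and
    groups that have no keys are skipped.
    """
    wanted = set(order)            # O(1) membership tests
    buckets = defaultdict(list)    # group name -> [(key, value), ...]
    for key, value in od.items():
        group = next((tok for tok in key.split('_') if tok in wanted), None)
        if group is not None:
            buckets[group].append((key, value))
    # Emit the buckets in the sequence given by `order`.
    return OrderedDict(
        item for name in order for item in buckets.get(name, [])
    )


old_OD = OrderedDict([('cat_1', 1), ('dog_1', 2), ('cat_2', 3),
                      ('fish_1', 4), ('dog_2', 5)])
new_OD = regroup(old_OD, ['dog', 'cat', 'fish'])
print(list(new_OD))  # ['dog_1', 'dog_2', 'cat_1', 'cat_2', 'fish_1']
```

Because the key is split on every underscore, keys such as 'Big_cat_3' or 'dog_black_2' land in the 'cat' and 'dog' groups as well.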
Upvotes: 0
Reputation: 106553
You can build a dict that maps each item in order to its index, and then use the sorted function with a key function that, for each key in old_OD, finds the substring that appears in the mapping dict and returns the corresponding index:
keys = {k: i for i, k in enumerate(order)}
OrderedDict(sorted(old_OD.items(), key=lambda t: keys.get(next(i for i in t[0].split('_') if i in keys))))
This returns:
OrderedDict([('dog_1', 2), ('dog_2', 5), ('cat_1', 1), ('cat_2', 3), ('fish_1', 4)])
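Since `sorted` is stable, items that map to the same index keep their original relative order. The sketch below illustrates this with sample data of my own; the function name `group_rank` is hypothetical, and sending unmatched keys to the end is an assumption made for this example (the one-liner above would raise an exception for such keys).

```python
from collections import OrderedDict

order = ['dog', 'cat', 'fish']
rank = {name: i for i, name in enumerate(order)}


def group_rank(item):
    """Return the position in `order` of the first matching token of the key.

    Assumption for this sketch: keys that match nothing sort after all groups.
    """
    key, _value = item
    return next((rank[tok] for tok in key.split('_') if tok in rank), len(order))


# 'dog_2' deliberately precedes 'dog_1' to show that the sort is stable.
data = OrderedDict([('dog_2', 5), ('cat_1', 1), ('bird_1', 9), ('dog_1', 2)])
result = OrderedDict(sorted(data.items(), key=group_rank))
print(list(result))  # ['dog_2', 'dog_1', 'cat_1', 'bird_1']
```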
Upvotes: 2
Reputation: 48077
Here are two variants of a solution.
1. For keys with the same prefix, keep the order of the initial OrderedDict
Here I use a list comprehension that iterates over the order list and the OrderedDict. Based on the prefix comparison, it builds a list of tuples in the desired order, which is passed to the OrderedDict constructor:
>>> from collections import OrderedDict
>>> old_OD = OrderedDict([('cat_1',1),
... ('dog_1',2),
... ('cat_2',3),
... ('fish_1',4),
... ('dog_2',5)])
>>> order = ['dog', 'cat', 'fish']
>>> new_OD = OrderedDict([(k,v) for o in order for k, v in old_OD.items() if k.startswith(o+'_')])
# to match the prefix pattern of <key> + "_" ^
where new_OD will hold:
OrderedDict([('dog_1', 2), ('dog_2', 5), ('cat_1', 1), ('cat_2', 3), ('fish_1', 4)])
2. For keys with the same prefix, sort the elements lexicographically
We can modify the above solution using sorted and itertools.chain with a nested list comprehension to achieve this:
>>> from itertools import chain
>>> new_OD = OrderedDict(chain(*[sorted([(k,v) for k, v in old_OD.items() if k.startswith(o+'_')]) for o in order]))
where new_OD will hold:
OrderedDict([('dog_1', 2), ('dog_2', 5), ('cat_1', 1), ('cat_2', 3), ('fish_1', 4)])
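With the sample data the two variants give the same result, because the keys of each group already appear in sorted order. The sketch below uses illustrative data of my own (not from the question) where they differ. It also shows that lexicographic sorting places 'dog_10' before 'dog_2'.

```python
from collections import OrderedDict
from itertools import chain

order = ['dog', 'cat']
# Illustrative data: the 'dog' keys are not in sorted order.
od = OrderedDict([('dog_2', 'a'), ('cat_1', 'b'), ('dog_10', 'c'), ('dog_1', 'd')])

# Variant 1: keep the original relative order inside each group.
keep_order = OrderedDict(
    (k, v) for o in order for k, v in od.items() if k.startswith(o + '_')
)

# Variant 2: sort each group lexicographically by key.
sorted_groups = OrderedDict(chain.from_iterable(
    sorted((k, v) for k, v in od.items() if k.startswith(o + '_'))
    for o in order
))

print(list(keep_order))     # ['dog_2', 'dog_10', 'dog_1', 'cat_1']
print(list(sorted_groups))  # ['dog_1', 'dog_10', 'dog_2', 'cat_1']
```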
Upvotes: 2
Reputation: 17824
You can use the groupby() function on the sorted dictionary items:
from collections import OrderedDict
from itertools import groupby, chain
from operator import itemgetter
old_OD = OrderedDict([('cat_1',1),
('dog_1',2),
('cat_2',3),
('fish_1',4),
('dog_2',5)])
order = ['dog', 'cat', 'fish']
gb = groupby(sorted(old_OD.items()), key=lambda t: t[0].split('_')[0])
gb = {k: list(g) for k, g in gb}
OrderedDict(chain.from_iterable(itemgetter(*order)(gb)))
# OrderedDict([('dog_1', 2), ('dog_2', 5), ('cat_1', 1), ('cat_2', 3), ('fish_1', 4)])
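Two details of this approach are worth keeping in mind: itemgetter raises KeyError when an entry of order has no keys in the dictionary, and with a single-element order it returns the one list itself rather than a tuple of lists. The following is a hedged sketch of a variant that avoids both by looking groups up with dict.get. The helper name `prefix` and the extra 'bird' entry are assumptions added for illustration.

```python
from collections import OrderedDict
from itertools import groupby


def prefix(item):
    """Group name of an item: the part of the key before the first '_'."""
    return item[0].split('_')[0]


old_OD = OrderedDict([('cat_1', 1), ('dog_1', 2), ('cat_2', 3),
                      ('fish_1', 4), ('dog_2', 5)])
order = ['dog', 'bird', 'cat', 'fish']  # 'bird' has no keys in old_OD

# groupby only merges adjacent items, so sort by the grouping key first.
# sorted() is stable, so items keep their original order inside each group.
groups = {name: list(items)
          for name, items in groupby(sorted(old_OD.items(), key=prefix), key=prefix)}

# Missing groups fall back to an empty list instead of raising KeyError.
new_OD = OrderedDict(item for name in order for item in groups.get(name, []))
print(list(new_OD))  # ['dog_1', 'dog_2', 'cat_1', 'cat_2', 'fish_1']
```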
Upvotes: 0
Reputation: 14216
Here is another approach using a regex and functools.partial.
import re
from operator import itemgetter
from functools import partial
first = itemgetter(0)
pattern = '|'.join(order) # 'dog|cat|fish'
def group(order, pattern, txt):
    item = first(txt)
    res = re.search(pattern, item)
    return order.index(res.group(0))
p = partial(group, order, pattern)
OrderedDict(sorted(old_OD.items(), key=p))
OrderedDict([('dog_1', 2),
('dog_2', 5),
('cat_1', 1),
('cat_2', 3),
('fish_1', 4)])
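Because re.search scans the whole key, this approach also covers keys where the keyword is not a prefix, such as "Big_cat_3" or "dog_black_2". The sketch below is a self-contained variation under a few assumptions of mine: the sample keys are illustrative, `group_index` is a hypothetical name, keywords are passed through re.escape in case they contain regex metacharacters, and keys without any keyword are sorted last.

```python
import re
from collections import OrderedDict

order = ['dog', 'cat', 'fish']
# re.escape guards against keywords that contain regex metacharacters.
pattern = re.compile('|'.join(re.escape(word) for word in order))


def group_index(item):
    """Position in `order` of the first keyword found anywhere in the key."""
    match = pattern.search(item[0])
    # Assumption for this sketch: keys with no keyword sort last.
    return order.index(match.group(0)) if match else len(order)


data = OrderedDict([('Big_cat_3', 1), ('dog_black_2', 2),
                    ('fish_1', 3), ('small_dog_7', 4)])
result = OrderedDict(sorted(data.items(), key=group_index))
print(list(result))  # ['dog_black_2', 'small_dog_7', 'Big_cat_3', 'fish_1']
```

Note that this matches plain substrings, so a key like 'dogfish_1' would be grouped under 'dog'. If that matters, matching on whole '_'-separated tokens is the safer choice.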
Upvotes: 0