George Shuklin
George Shuklin

Reputation: 7907

Sort some list elements in certain positions, leaving other positions unchanged

I have a list of elements, with associated sorting keys

data = [
 {'data': 0},
 {'key': 'foo', 'data': 1},
 {'key': 'bar', 'data': 2},
 {'data': 3},
 {'key': 'foo', 'data': 4},
 {'key': 'bar', 'data': 5},
 {'key': 'bar', 'data': 6},
 {'data': 7}
]

Some elements has sorting key, some does not. I have an external table with sorting for some keys:

ORDER = ['bar', 'foo', 'baz']

Expected result:

[
 {'data': 0},
 {'key': 'bar', 'data': 2},
 {'key': 'bar', 'data': 5},
 {'data': 3},
 {'key': 'bar', 'data': 6},
 {'key': 'foo', 'data': 1},
 {'key': 'foo', 'data': 4},
 {'data': 7}
]

I need to sort the array in the way that only elements with key are moved, and elements with the same key are not moved. (Basically, I have a partial order for elements with requirements to not to swap elements if key comparison does not require so). As far as I can see, existing list.sort/sorted (in Python) does not support this type of ordering.

I strongly suspect this problem was solved many times, but I can't find a proper name to search.

How this type of sorting is called?

Upvotes: 3

Views: 714

Answers (2)

Dani Mesejo
Dani Mesejo

Reputation: 61910

You can do this in O(n) time, if you have few keys, using something like a bucket sort for example:

import pprint
from collections import defaultdict
from itertools import chain

# create buckets
buckets, no_key = defaultdict(list), {}
for i, d in enumerate(data):
    if 'key' in d:
        buckets[d['key']].append(d)
    else:
        no_key[i] = d

# create one sorted list for elements with keys, basically iterate over the buckets in ORDER
chained = list(chain.from_iterable([buckets[key] for key in ORDER]))[::-1]

# combine the two iterables, chained and no_key into the final result
final = [no_key.pop(i) if i in no_key else chained.pop() for i in range(len(data))]
pprint.pprint(final)

Output

[{'data': 0},
 {'data': 2, 'key': 'bar'},
 {'data': 5, 'key': 'bar'},
 {'data': 3},
 {'data': 6, 'key': 'bar'},
 {'data': 1, 'key': 'foo'},
 {'data': 4, 'key': 'foo'},
 {'data': 7}]

Upvotes: 2

cs95
cs95

Reputation: 402713

Create an iterator of sorted records:

it1 = iter(sorted((d for d in data if 'key' in d), 
                  key=lambda d_: ORDER.index(d_['key'])))

Next, write a list comprehension to selectively overwrite records with "key", leaving others unchanged.

res = [d if 'key' not in d else next(it1) for d in data]
print (res)
# [{'data': 0},
#  {'key': 'bar', 'data': 2},
#  {'key': 'bar', 'data': 5},
#  {'data': 3},
#  {'key': 'bar', 'data': 6},
#  {'key': 'foo', 'data': 1},
#  {'key': 'foo', 'data': 4},
#  {'data': 7}]

Some points:

  • it might be worth converting ORDER into a dictionary mapping keys to positions, so the lookup is constant time. That should bring down the complexity of the sorting step.

  • sorted returns a list, which is then converted to an iterator, so the intermediate list cannot be avoided.

Upvotes: 4

Related Questions