Mr_and_Mrs_D
Mr_and_Mrs_D

Reputation: 34036

Sort a subset of a python list to have the same relative order as in other list

So having a list say b = [b1, b2, b3] I want to be able to sort a list a in such a way that all bi's that also exist in a have the same relative order as in b - leaving the rest of a's elements alone. So

a = [ b1, x, b3, y, b2] -> [ b1, x, b2, y, b3]
a = [ b1, x, b2, y, b3] -> no change
a = [ b1, x, y, b2]     -> no change
a = [ b3, x, b1, y, b2] -> [ b1, x, b2, y, b3]

b of course may be a tuple or any other ordered structure. What I came up with

bslots = dict((x, a.index(x)) for x in a if x in b)
bslotsSorted = sorted(bslots.keys(), key=lambda y: b.index(y))
indexes = sorted(bslots.values())
for x,y in zip(bslotsSorted, indexes):
  a[y] = x

is clumsy and O(n^2)

Upvotes: 8

Views: 1527

Answers (2)

Mr_and_Mrs_D
Mr_and_Mrs_D

Reputation: 34036

The accepted answer answers the question as asked, but my actual problem was a bit more restrictive - namely I would like to keep items in the same relative positions in a as much as possible. So the accepted answer (and my original attempt) would:

b = [ A, E, B ]
a = [ A, B, C, D, E, Z] -> [ A, E, C, D, B, Z ]

I would like to just "bubble up" the out-of-order items so they lose as few of their ancestors as possible: [ A, B, C, D, E, Z ] -> [ A, C, D, E, B, Z ]. Note that previously E would lose ancestors C and D, while now only loses B as required. Marginally tested:

def reorder(a, b):
    bb = b[:]
    b_in_a = [x for x in a if x in set(b)]
    w = dict((x, i) for i, x in enumerate(a))
    while b_in_a:
        for i, (ordered, current) in enumerate(zip(bb, b_in_a)):
            if ordered != current:
                for j, x in enumerate(b_in_a[i:]):
                    if x == ordered: break
                    to = w[ordered] + 1 + j
                    w = dict((x,i if i < to else i+1) for x,i in w.iteritems())
                    w[x] = to # bubble them up !
                b_in_a.remove(ordered)
                bb = bb[i + 1:]
                b_in_a = b_in_a[i:]
                break
        else:
            break
    aa = a[:]
    a.sort(key=w.__getitem__)
    print aa, '-', b, ' -> ', a

# ['A', 'B', 'C', 'D', 'E'] - ['A', 'E', 'B']  ->  ['A', 'C', 'D', 'E', 'B']
# ['A', 'B', 'C', 'D', 'E', 'F'] - ['A', 'E', 'C', 'B']  ->  ['A', 'D', 'E', 'C', 'B', 'F']

Upvotes: 0

Ashwini Chaudhary
Ashwini Chaudhary

Reputation: 251011

  • First create a dictionary using items from b where the key is the item and value is its index, we will use this to sort the matched items in a later on.

  • Now filter out item from a that are present in that dict, dict provides O(1) lookup.

  • Now sort this list of filtered items and convert it to an iterator.

  • Now loop over a again and for each item check if is present in dict then fetch its value from iterator otherwise use it as is.

def solve(a, b):
    dct = {x: i for i, x in enumerate(b)}
    items_in_a = [x for x in a if x in dct]
    items_in_a.sort(key=dct.get)
    it = iter(items_in_a)
    return [next(it) if x in dct else x for x in a]
...
>>> b = ['b1', 'b2', 'b3']
>>> a = [ 'b1', 'x', 'b3', 'y', 'b2']
>>> solve(a, b)
['b1', 'x', 'b2', 'y', 'b3']
>>> a = [ 'b1', 'x', 'b2', 'y', 'b3']
>>> solve(a, b)
['b1', 'x', 'b2', 'y', 'b3']
>>> a = [ 'b1', 'x', 'y', 'b2']
>>> solve(a, b)
['b1', 'x', 'y', 'b2']
>>> a = [ 'b3', 'x', 'b1', 'y', 'b2']
>>> solve(a, b)
['b1', 'x', 'b2', 'y', 'b3']

Overall time complexity is going to be max of (O(len(a)), O(len(b)), O(items_in_a_length log items_in_a_length).

Upvotes: 5

Related Questions