Shane
Shane

Reputation: 4983

How to re-arrange list like this (python)?

For example, list to_be consists of: 3 of "a", 4 of "b", 3 of "c", 5 of "d"...

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d", ...]

Now I want it to be like this:

done = ["a", "b", "c", "d", ... , "a", "b", "c", "d", ... , "b", "d", ...] (notice: some items are more than others as in amounts, but they need to be still in a pre-defined order, alphabetically for example)

What's the fastest way to do this?

Upvotes: 3

Views: 212

Answers (5)

inspectorG4dget
inspectorG4dget

Reputation: 113915

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
counts = collections.Counter(to_be)
answer = []
while counts:
    answer.extend(sorted(counts))
    for k in counts:
        counts[k] -= 1
    counts = {k:v for k,v in counts.iteritems() if v>0}

Now, answer looks like this:

['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

Upvotes: 2

Gareth Latty
Gareth Latty

Reputation: 88977

Presuming I am understanding what you want, it can be done relatively easily by combining itertools.zip_longest, itertools.groupby and itertools.chain.from_iterable():

We first group the items into sets (the "a"s, the "b"s, etc...), we zip them up to get them in the order your want (one from each set), use chain to produce a single list, and then remove the None values introduced by the zipping.

>>> [item for item in itertools.chain.from_iterable(itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

You may want to separate out some of the list comprehensions to make it a bit more readable, however:

>>> groups = itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])
>>> [item for item in itertools.chain.from_iterable(groups) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

(The given version is for 3.x, for 2.x, you will want izip_longest().)

As always, if you expect empty strings, 0, etc... then you will want to do if item is not None, and if you need to keep None values in tact, create a sentinel object and check for identity against that.

You could also use the roundrobin() recipe given in the docs, as an alternative to zipping, which makes it as simple as:

>>> list(roundrobin(*[list(x) for _, x in itertools.groupby(to_be)]))
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

As a final note, the observant might note me making lists from the groupby() generators, which may seem wasteful, the reason comes from the docs:

The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.

Upvotes: 12

jsbueno
jsbueno

Reputation: 110208

Doing it "by hand and state machinne" should be way more efficient - but for relatively small lists (<5000), you should have no problem taking vantage of Python goodies doing this:

to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d","e", "e"]


def do_it(lst):
    lst = lst[:]
    result = []

    while True:
        group = set(lst)
        result.extend(sorted(group))
        for element in group:
            del lst[lst.index(element)]
        if not lst:
            break
    return result

done = do_it(to_be)

The "big O" complexity of the function above should be really BIG. I had not event ried to figure it out.

Upvotes: 0

Jordan Lewis
Jordan Lewis

Reputation: 17928

A bit less elegant than Lattyware's:

import collections
def rearrange(l):
    counts = collections.Counter(l)
    output = []
    while (sum([v for k,v in counts.items()]) > 0):
        output.extend(sorted([k for k, v in counts.items() if v > 0))
        for k in counts:
            counts[k] = counts[k] - 1 if counts[k] > 0 else 0
    return counts

Upvotes: 0

mgilson
mgilson

Reputation: 309841

I'm not sure if this is fastest, but here's my stab at it:

>>> d = defaultdict(int)
>>> def sort_key(a):
...     d[a] += 1
...     return d[a],a
...

>>> sorted(to_be,key=sort_key)
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']

wrapped up in a function:

def weird_sort(x):
    d = defaultdict(int)
    def sort_key(a):
        d[a] += 1
        return (d[a],a)
    return sorted(x,key=sort_key)

Of course, this requires that the elements in your iterable be hashable.

Upvotes: 1

Related Questions