Reputation: 4983
For example, list to_be
consists of: 3 of "a"
, 4 of "b"
, 3 of "c"
, 5 of "d"
...
to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d", ...]
Now I want it to be like this:
done = ["a", "b", "c", "d", ... , "a", "b", "c", "d", ... , "b", "d", ...] (notice: some items are more than others as in amounts, but they need to be still in a pre-defined order, alphabetically for example)
What's the fastest way to do this?
Upvotes: 3
Views: 212
Reputation: 113915
to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d"]
counts = collections.Counter(to_be)
answer = []
while counts:
answer.extend(sorted(counts))
for k in counts:
counts[k] -= 1
counts = {k:v for k,v in counts.iteritems() if v>0}
Now, answer
looks like this:
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
Upvotes: 2
Reputation: 88977
Presuming I am understanding what you want, it can be done relatively easily by combining itertools.zip_longest
, itertools.groupby
and itertools.chain.from_iterable()
:
We first group the items into sets (the "a"
s, the "b"
s, etc...), we zip them up to get them in the order your want (one from each set), use chain to produce a single list, and then remove the None
values introduced by the zipping.
>>> [item for item in itertools.chain.from_iterable(itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
You may want to separate out some of the list comprehensions to make it a bit more readable, however:
>>> groups = itertools.zip_longest(*[list(x) for _, x in itertools.groupby(to_be)])
>>> [item for item in itertools.chain.from_iterable(groups) if item]
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
(The given version is for 3.x, for 2.x, you will want izip_longest()
.)
As always, if you expect empty strings, 0, etc... then you will want to do if item is not None
, and if you need to keep None
values in tact, create a sentinel object and check for identity against that.
You could also use the roundrobin()
recipe given in the docs, as an alternative to zipping, which makes it as simple as:
>>> list(roundrobin(*[list(x) for _, x in itertools.groupby(to_be)]))
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
As a final note, the observant might note me making lists from the groupby()
generators, which may seem wasteful, the reason comes from the docs:
The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.
Upvotes: 12
Reputation: 110208
Doing it "by hand and state machinne" should be way more efficient - but for relatively small lists (<5000), you should have no problem taking vantage of Python goodies doing this:
to_be = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d", "d", "d", "d","e", "e"]
def do_it(lst):
lst = lst[:]
result = []
while True:
group = set(lst)
result.extend(sorted(group))
for element in group:
del lst[lst.index(element)]
if not lst:
break
return result
done = do_it(to_be)
The "big O" complexity of the function above should be really BIG. I had not event ried to figure it out.
Upvotes: 0
Reputation: 17928
A bit less elegant than Lattyware's:
import collections
def rearrange(l):
counts = collections.Counter(l)
output = []
while (sum([v for k,v in counts.items()]) > 0):
output.extend(sorted([k for k, v in counts.items() if v > 0))
for k in counts:
counts[k] = counts[k] - 1 if counts[k] > 0 else 0
return counts
Upvotes: 0
Reputation: 309841
I'm not sure if this is fastest, but here's my stab at it:
>>> d = defaultdict(int)
>>> def sort_key(a):
... d[a] += 1
... return d[a],a
...
>>> sorted(to_be,key=sort_key)
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'b', 'd', 'd']
wrapped up in a function:
def weird_sort(x):
d = defaultdict(int)
def sort_key(a):
d[a] += 1
return (d[a],a)
return sorted(x,key=sort_key)
Of course, this requires that the elements in your iterable be hashable.
Upvotes: 1