Zal

Reputation: 975

Sort list with binary values

If I have a categorical list with only two values, how can I sort it so that the two values alternate, one after another?

Example:

# input list
lst = ['foo', 'bar', 'bar', 'foo', 'bar', 'bar', 'foo', 'foo']

# expected output
['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar']

I have a working solution, but I feel it could be done in a smarter way. I also looked into itertools but could not find anything useful for my problem.

My solution:

foo = [val for val in lst if val == 'foo']
bar = [val for val in lst if val == 'bar']

lst2 = [[x, y] for x, y in zip(foo, bar)]

final_list = [val for l in lst2 for val in l]

print(final_list)
# ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar']

Note: the lists always contain an equal number of both values.

Upvotes: 3

Views: 321

Answers (3)

Thierry Lathuille

Reputation: 24280

If you want the first value kept first, you can simply build a two-element list from the first value and the other value, then repeat it as many times as needed:

lst = ['foo', 'bar', 'bar', 'foo', 'bar', 'bar', 'foo', 'foo']

first = lst[0]
second = (set(lst) - {first}).pop()
out = [first, second] * (len(lst)//2) 
print(out)
# ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar']

A better way to get the other value, without unnecessarily iterating over the whole list to build the set of two values, is to take the first value in the list that differs from the first element:

# input list
lst = ['foo', 'bar', 'bar', 'foo', 'bar', 'bar', 'foo', 'foo']
first = lst[0]
second = next(value for value in lst if value != first)
out = [first, second] * (len(lst)//2) 
print(out)
# ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar']
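The snippet above can be wrapped in a small helper. This is only a sketch: the function name `alternate` and the sentinel `_MISSING` are made up for illustration, and it assumes the list holds exactly two distinct values in equal numbers, as stated in the question.

```python
_MISSING = object()  # hypothetical sentinel, so that None can still be a list value


def alternate(values):
    """Return the two distinct values of `values` alternating, first value first."""
    if not values:
        return []
    first = values[0]
    # next() with a default avoids StopIteration if every element equals `first`
    second = next((v for v in values if v != first), _MISSING)
    if second is _MISSING:
        raise ValueError("expected two distinct values")
    # Assumes both values occur equally often, so len(values) is even
    return [first, second] * (len(values) // 2)


lst = ['foo', 'bar', 'bar', 'foo', 'bar', 'bar', 'foo', 'foo']
print(alternate(lst))
# ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar']
```

Passing a default to next() is a design choice here: it turns the "only one distinct value" case into an explicit ValueError instead of a bare StopIteration.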

Upvotes: 2

RoadRunner

Reputation: 26325

Instead of iterating lst twice to get foo and bar into separate lists, you could iterate it once and group the values into a collections.defaultdict.

Then you could flatten the zipped values with itertools.chain.from_iterable.

from collections import defaultdict
from itertools import chain

lst = ['foo', 'bar', 'bar', 'foo', 'bar', 'bar', 'foo', 'foo']

d = defaultdict(list)
for item in lst:
    d[item].append(item)
# defaultdict(<class 'list'>, {'foo': ['foo', 'foo', 'foo', 'foo'], 'bar': ['bar', 'bar', 'bar', 'bar']})

print(list(chain.from_iterable(zip(*d.values()))))
# ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar']

We could also count the items with collections.Counter, then multiply each key (as a one-element list) by its count:

from itertools import chain
from collections import Counter

lst = ["foo", "bar", "bar", "foo", "bar", "bar", "foo", "foo"]

counts = Counter(lst)
# Counter({'foo': 4, 'bar': 4})

print(list(chain.from_iterable(zip(*([k] * v for k, v in counts.items())))))
# ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar']
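One thing worth keeping in mind with both snippets: zip stops at the shortest group, so they rely on the guarantee from the question that both values occur equally often. The sketch below (the helper name `interleave_groups` is made up for illustration) shows what would happen if that assumption did not hold.

```python
from collections import defaultdict
from itertools import chain


def interleave_groups(values):
    """Group equal items in one pass, then interleave the groups with zip."""
    groups = defaultdict(list)
    for item in values:
        groups[item].append(item)
    # zip truncates to the shortest group, so surplus items are dropped
    return list(chain.from_iterable(zip(*groups.values())))


balanced = interleave_groups(['foo', 'bar', 'bar', 'foo'])
unbalanced = interleave_groups(['foo', 'bar', 'foo'])

print(balanced)
# ['foo', 'bar', 'foo', 'bar']
print(unbalanced)
# ['foo', 'bar']
```

The order of the output also depends on dicts preserving insertion order, which is guaranteed from Python 3.7 onwards.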

Upvotes: 2

Xiddoc

Reputation: 3628

You can use the * operator to repeat a list containing the two values:

lst = ['foo', 'bar', 'bar', 'foo', 'bar', 'bar', 'foo', 'foo']
print(["foo", "bar"] * int(len(lst) / 2))

# ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar']

How does it work?

len(lst) / 2

First, it takes the length of the list and divides it by two. Since we know both items occur equally often, the length is even and the result is always a whole number.

int(len(lst) / 2)

Even though the result is a whole number, the / operator always returns a float in Python 3, so you must convert it back with int(). Floor division, len(lst) // 2, would give an int directly.

["foo", "bar"] * int(len(lst) / 2)

Finally, Python repeats the two-element list that many times. Each repetition adds 2 items, so 2 * 0.5x = x and the result has the same length as the original list.
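To make the float issue concrete, here is a short sketch comparing / and // on the same list. The variable names are only for illustration.

```python
lst = ['foo', 'bar', 'bar', 'foo', 'bar', 'bar', 'foo', 'foo']

half_float = len(lst) / 2   # true division: always a float in Python 3
half_int = len(lst) // 2    # floor division: stays an int

print(type(half_float).__name__, type(half_int).__name__)
# float int

# A list cannot be repeated a float number of times, even a whole one
error_message = None
try:
    ["foo", "bar"] * half_float
except TypeError as exc:
    error_message = str(exc)

result = ["foo", "bar"] * half_int
print(result)
# ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar']
```

Using // avoids the int() call entirely.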

Upvotes: 1
