Reputation: 13041
I want find all possible ways to map a series of (contiguous) integers M = {0,1,2,...,m} to another series of integers N = {0,1,2,...,n} where m > n, subject to the constraint that only contiguous integers in M map to the same integer in N.
The following piece of python code comes close (start
corresponds to the first element in M, stop
-1 corresponds to the last element in M, and nbins
corresponds to |N|):
import itertools
def find_bins(start, stop, nbins):
if (nbins > 1):
return list(list(itertools.product([range(start, ii)], find_bins(ii, stop, nbins-1))) for ii in range(start+1, stop-nbins+2))
else:
return [range(start, stop)]
E.g
In [20]: find_bins(start=0, stop=5, nbins=3)
Out[20]:
[[([0], [([1], [2, 3, 4])]),
([0], [([1, 2], [3, 4])]),
([0], [([1, 2, 3], [4])])],
[([0, 1], [([2], [3, 4])]),
([0, 1], [([2, 3], [4])])],
[([0, 1, 2], [([3], [4])])]]
However, as you can see the output is nested, and for the life of me, I cant find a way to properly amend the code without breaking it.
The desired output would look like this:
In [20]: find_bins(start=0, stop=5, nbins=3)
Out[20]:
[[(0), (1), (2, 3, 4)],
[(0), (1, 2), (3, 4)],
[(0), (1, 2, 3), (4)],
[(0, 1), (2), (3, 4)],
[(0, 1), (2, 3), (4)],
[(0, 1, 2), (3), (4)]]
Upvotes: 1
Views: 347
Reputation: 70725
I suggest a different approach: a partitioning into n
non-empty bins is uniquely determined by the n-1
distinct indices marking the boundaries between the bins, where the first marker is after the first element, and the final marker before the last element. itertools.combinations()
can be used directly to generate all such index tuples, and then it's just a matter of using them as slice indices. Like so:
def find_nbins(start, stop, nbins):
from itertools import combinations
base = range(start, stop)
nbase = len(base)
for ixs in combinations(range(1, stop - start), nbins - 1):
yield [tuple(base[lo: hi])
for lo, hi in zip((0,) + ixs, ixs + (nbase,))]
Then, e.g.,
for x in find_nbins(0, 5, 3):
print(x)
displays:
[(0,), (1,), (2, 3, 4)]
[(0,), (1, 2), (3, 4)]
[(0,), (1, 2, 3), (4,)]
[(0, 1), (2,), (3, 4)]
[(0, 1), (2, 3), (4,)]
[(0, 1, 2), (3,), (4,)]
Just noting that there's a more general underlying problem here: generating the ways to break an arbitrary sequence into n
non-empty bins. Then the specific question here is applying that to the sequence range(start, stop)
. I believe viewing it that way makes the code easier to understand, so here it is:
def gbins(seq, nbins):
from itertools import combinations
base = tuple(seq)
nbase = len(base)
for ixs in combinations(range(1, nbase), nbins - 1):
yield [base[lo: hi]
for lo, hi in zip((0,) + ixs, ixs + (nbase,))]
def find_nbins(start, stop, nbins):
return gbins(range(start, stop), nbins)
Upvotes: 2
Reputation: 13041
This does what I want; I will gladly accept simpler, more elegant solutions:
def _split(start, stop, nbins):
if (nbins > 1):
out = []
for ii in range(start+1, stop-nbins+2):
iterator = itertools.product([range(start, ii)], _split(ii, stop, nbins-1))
for item in iterator:
out.append(item)
return out
else:
return [range(start, stop)]
def _unpack(nested):
unpacked = []
if isinstance(nested, (list, tuple)):
for item in nested:
if isinstance(item, tuple):
for subitem in item:
unpacked.extend(_unpack(subitem))
elif isinstance(item, list):
unpacked.append([_unpack(subitem) for subitem in item])
elif isinstance(item, int):
unpacked.append([item])
return unpacked
else: # integer
return nested
def find_nbins(start, stop, nbins):
nested = _split(start, stop, nbins)
unpacked = [_unpack(item) for item in nested]
return unpacked
Upvotes: 0