Yury Kirienko

Reputation: 2029

itertools.ifilter with IPython Parallel

For a problem [proven to be NP-hard] I have no option but exhaustive search. I have a set of data — for simplicity, S = ['A', 'B', 'C', ..., 'Z'] — and want to apply a function f to all subsets of length N < len(S) of this set. I cannot use lists here, since the binomial coefficient binom(len(S), N) runs into the billions. But the result of f(x) is zero for almost all subsets x ⊆ S. Therefore, in simple cases everything works fine with

   from itertools import ifilter, combinations
   answer = list(ifilter(lambda x: f(x) > 0, combinations(S,N)))
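As an aside, `ifilter` exists only in Python 2; on Python 3 the built-in `filter` is already lazy, so the same pattern works without it (a toy `f`, `S`, and `N` are assumed here purely for illustration):

```python
from itertools import combinations

S = ['A', 'B', 'C', 'D', 'E']        # toy data set (assumed for illustration)
N = 3
f = lambda x: 1 if 'A' in x else 0   # hypothetical scoring function

# On Python 3, filter is lazy, so nothing is materialized until list() runs.
answer = list(filter(lambda x: f(x) > 0, combinations(S, N)))
```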

But in real life, len(S) ~ 10⁴ and N ~ 10². What I want is to spread the work among CPU engines using ipyparallel. I have a small cluster with a hundred CPU cores. But I still cannot afford to store the combinations as lists, so I need something like separate, independent generators.

There are a couple of examples of how to split a generator into chunks, but as far as I understand they are still consecutive generators. There is also an idea from @minrk that is related, but it performs really badly for some reason.
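One way to get truly independent chunks is combination unranking: the `nth_combination` recipe from the itertools documentation maps a lexicographic rank directly to a combination, so each engine can be handed a disjoint rank range and generate its own combinations with no shared state. A minimal sketch (the `chunk` helper is a hypothetical name, not part of itertools):

```python
from itertools import combinations
from math import comb

def nth_combination(iterable, r, index):
    """Equivalent to list(combinations(iterable, r))[index], without
    materializing the list (recipe from the itertools documentation)."""
    pool = tuple(iterable)
    n = len(pool)
    c = comb(n, r)
    if index < 0:
        index += c
    if index < 0 or index >= c:
        raise IndexError("index out of range")
    result = []
    while r:
        c, n, r = c * r // n, n - 1, r - 1
        while index >= c:
            index -= c
            c, n = c * (n - r) // n, n - 1
        result.append(pool[-1 - n])
    return tuple(result)

def chunk(S, N, start, stop):
    """Lazily yield the N-combinations of S with lexicographic
    ranks in [start, stop) — one such range per engine."""
    for i in range(start, stop):
        yield nth_combination(S, N, i)
```

Each unranking costs more than advancing `itertools.combinations` by one step, so at the question's scale this only moves the bottleneck around rather than removing it — which is the point of the answer below.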

So the questions are:

Upvotes: 4

Views: 176

Answers (1)

user2357112

Reputation: 281551

Exhaustive search is utterly hopeless here, no matter how you parallelize it. With len(S) and N at such high orders of magnitude, you would need to search through about 6e241 solution candidates. This is far beyond the capacity of any computational system humanity could ever hope to build.

You will need to use a smarter algorithm.
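The scale of that number can be checked directly with `math.comb`, assuming len(S) = 10⁴ and N = 10² as in the question:

```python
from math import comb

# Binomial coefficient at the question's scale: len(S) = 10**4, N = 10**2.
candidates = comb(10**4, 10**2)

# The count has 242 decimal digits, i.e. it is on the order of 6e241.
print(len(str(candidates)))
```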

Upvotes: 1
