Reputation: 261
Consider all lists of length n
where the values are integers from the range 0 to n-1
. I would like to iterate as quickly as possible over every possible such list that has exactly one duplicate. If n = 2
then the possible lists are:
00
11
If n = 3
they are:
001
002
110
112
220
221
100
200
011
211
022
122
010
020
101
121
202
212
The total number of such lists is n! * (n choose 2)
so I would like to avoid storing them all in memory if at all possible.
For each list I want to call a function that calculates a function of the list.
What's a good way to do this?
Benchmarks On my computer I get the following benchmark results
timeit all(one_duplicate(6))
100 loops, best of 3: 6.87 ms per loops
timeit all(one_duplicate(7))
10 loops, best of 3: 59 ms per loop
timeit all(one_duplicate(8))
1 loops, best of 3: 690 ms per loop
timeit all(one_duplicate(9))
1 loops, best of 3: 7.58 s per loop
and
timeit all(duplicates(range(6)))
10 loops, best of 3: 22.8 ms per loop
timeit all(duplicates(range(7)))
1 loops, best of 3: 416 ms per loop
timeit all(duplicates(range(8)))
1 loops, best of 3: 9.78 s per loop
and
timeit all(all_doublet_tuples(6))
100 loops, best of 3: 6.36 ms per loop
timeit all(all_doublet_tuples(7))
10 loops, best of 3: 60 ms per loop
timeit all(all_doublet_tuples(8))
1 loops, best of 3: 542 ms per loop
timeit all(all_doublet_tuples(9))
1 loops, best of 3: 7.27 s per loop
I declare all_doublet_tuples
and one_duplicate
to be equal first at the moment (allowing for noise in the tests).
Upvotes: 4
Views: 184
Reputation: 3695
import itertools
n=3
for i in range(n-1):
for j in range(n-i-1):
for perm in map(list, itertools.permutations(range(n), n-1)):
perm.insert(i, perm[i+j])
print perm
key concept here is to generate all N-1 length permutations from a N-length set and then just run it over all repetition combinations (i and i+j)
In case of N=3, permutations are:
(0, 1) (0, 2) (1, 0) (1, 2) (2, 0) (2, 1)
and repeated elements can be at positions marked by X:
XX_, X_X, _XX
Your results is the 'multiplication' of those sets:
XX_ X_X _XX
(0, 1) 001 010 011
(0, 2) 002 020 022
(1, 0) 110 101 100
(1, 2) 112 121 122
(2, 0) 220 202 200
(2, 1) 221 212 211
Everything is computed on the fly. I guess memory consumption is O(N).
Upvotes: 2
Reputation: 9584
from itertools import product
def duplicates(r):
r_len = len(r)
dup_len = r_len - 1
for tup in product(r, repeat=r_len):
if len(set(tup)) == dup_len:
yield tup
>>> list(duplicates(range(2)))
[(0, 0), (1, 1)]
>>> list(duplicates(range(3)))
[(0, 0, 1),
(0, 0, 2),
(0, 1, 0),
(0, 1, 1),
(0, 2, 0),
(0, 2, 2),
(1, 0, 0),
(1, 0, 1),
(1, 1, 0),
(1, 1, 2),
(1, 2, 1),
(1, 2, 2),
(2, 0, 0),
(2, 0, 2),
(2, 1, 1),
(2, 1, 2),
(2, 2, 0),
(2, 2, 1)]
Some benchmarks on my machine:
%timeit all(duplicates(range(5)))
100 loops, best of 3: 2.5 ms per loop
%timeit all(duplicates(range(6)))
10 loops, best of 3: 38.6 ms per loop
%timeit all(duplicates(range(7)))
1 loops, best of 3: 766 ms per loop
%timeit all(duplicates(range(8)))
1 loops, best of 3: 18.8 s per loop
Upvotes: 1
Reputation: 15345
Think of any of the desired output tuples in this way: It was derived by duplicating the first (lets say from the left) doublet elem and inserting it at the position of the second doublet.
Therefore my solution is basically this two-step process:
n-1
permutations of range(n)
import itertools as it
# add_doublet accomplishes step 2
def add_doublet(t):
for i in range(len(t)):
for j in range(i+1, len(t)+1):
yield t[:j] + (t[i],) + t[j:]
def all_doublet_tuples(n):
for unique_tuple in it.permutations(range(n), n-1):
for doublet_tuple in add_doublet(unique_tuple):
yield doublet_tuple
from pprint import pprint
n = 3
pprint(list(all_doublet_tuples(n)))
I do not print the output here cause its kind of lenghty ... and the case for n=3 is boring.
Upvotes: 5
Reputation: 208395
import itertools
def one_duplicate(n):
nrange = list(range(n))
for x in nrange:
without_x = nrange[:x] + nrange[x+1:]
for i, j in itertools.combinations(nrange, 2):
for others in map(list, itertools.permutations(without_x, n-2)):
others.insert(i, x)
others.insert(j, x)
yield others
>>> list(one_duplicate(2))
[[0, 0], [1, 1]]
>>> list(one_duplicate(3))
[[0, 0, 1], [0, 0, 2], [0, 1, 0], [0, 2, 0], [1, 0, 0], [2, 0, 0], [1, 1, 0], [1, 1, 2], [1, 0, 1], [1, 2, 1], [0, 1, 1], [2, 1, 1], [2, 2, 0], [2, 2, 1], [2, 0, 2], [2, 1, 2], [0, 2, 2], [1, 2, 2]]
>>> list(one_duplicate(4))
[[0, 0, 1, 2], [0, 0, 1, 3], [0, 0, 2, 1], [0, 0, 2, 3], [0, 0, 3, 1], [0, 0, 3, 2], [0, 1, 0, 2], [0, 1, 0, 3], [0, 2, 0, 1], [0, 2, 0, 3], [0, 3, 0, 1], [0, 3, 0, 2], [0, 1, 2, 0], [0, 1, 3, 0], [0, 2, 1, 0], [0, 2, 3, 0], [0, 3, 1, 0], [0, 3, 2, 0], [1, 0, 0, 2], [1, 0, 0, 3], [2, 0, 0, 1], [2, 0, 0, 3], [3, 0, 0, 1], [3, 0, 0, 2], [1, 0, 2, 0], [1, 0, 3, 0], [2, 0, 1, 0], [2, 0, 3, 0], [3, 0, 1, 0], [3, 0, 2, 0], [1, 2, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 3, 0, 0], [3, 1, 0, 0], [3, 2, 0, 0], [1, 1, 0, 2], [1, 1, 0, 3], [1, 1, 2, 0], [1, 1, 2, 3], [1, 1, 3, 0], [1, 1, 3, 2], [1, 0, 1, 2], [1, 0, 1, 3], [1, 2, 1, 0], [1, 2, 1, 3], [1, 3, 1, 0], [1, 3, 1, 2], [1, 0, 2, 1], [1, 0, 3, 1], [1, 2, 0, 1], [1, 2, 3, 1], [1, 3, 0, 1], [1, 3, 2, 1], [0, 1, 1, 2], [0, 1, 1, 3], [2, 1, 1, 0], [2, 1, 1, 3], [3, 1, 1, 0], [3, 1, 1, 2], [0, 1, 2, 1], [0, 1, 3, 1], [2, 1, 0, 1], [2, 1, 3, 1], [3, 1, 0, 1], [3, 1, 2, 1], [0, 2, 1, 1], [0, 3, 1, 1], [2, 0, 1, 1], [2, 3, 1, 1], [3, 0, 1, 1], [3, 2, 1, 1], [2, 2, 0, 1], [2, 2, 0, 3], [2, 2, 1, 0], [2, 2, 1, 3], [2, 2, 3, 0], [2, 2, 3, 1], [2, 0, 2, 1], [2, 0, 2, 3], [2, 1, 2, 0], [2, 1, 2, 3], [2, 3, 2, 0], [2, 3, 2, 1], [2, 0, 1, 2], [2, 0, 3, 2], [2, 1, 0, 2], [2, 1, 3, 2], [2, 3, 0, 2], [2, 3, 1, 2], [0, 2, 2, 1], [0, 2, 2, 3], [1, 2, 2, 0], [1, 2, 2, 3], [3, 2, 2, 0], [3, 2, 2, 1], [0, 2, 1, 2], [0, 2, 3, 2], [1, 2, 0, 2], [1, 2, 3, 2], [3, 2, 0, 2], [3, 2, 1, 2], [0, 1, 2, 2], [0, 3, 2, 2], [1, 0, 2, 2], [1, 3, 2, 2], [3, 0, 2, 2], [3, 1, 2, 2], [3, 3, 0, 1], [3, 3, 0, 2], [3, 3, 1, 0], [3, 3, 1, 2], [3, 3, 2, 0], [3, 3, 2, 1], [3, 0, 3, 1], [3, 0, 3, 2], [3, 1, 3, 0], [3, 1, 3, 2], [3, 2, 3, 0], [3, 2, 3, 1], [3, 0, 1, 3], [3, 0, 2, 3], [3, 1, 0, 3], [3, 1, 2, 3], [3, 2, 0, 3], [3, 2, 1, 3], [0, 3, 3, 1], [0, 3, 3, 2], [1, 3, 3, 0], [1, 3, 3, 2], [2, 3, 3, 0], [2, 3, 3, 1], [0, 3, 1, 3], [0, 3, 2, 3], [1, 3, 0, 3], [1, 3, 2, 3], [2, 3, 0, 3], [2, 3, 1, 3], [0, 1, 3, 3], [0, 2, 3, 3], [1, 0, 3, 3], [1, 2, 3, 3], [2, 0, 3, 3], [2, 1, 3, 3]]
Here is a description of the algorithm:
itertools.combinations(nrange, 2)
x
in the range, get all permutations of length n-2
for the entire range except for x
x
at each of the indices found in the first step and yield the result. Timing comparisons between my answer and Peter Stahl's:
# Just showing they both get the same result
In [23]: set(peter(range(6))) == set(tuple(x) for x in fj(6))
Out[23]: True
In [24]: %timeit list(peter(range(6)))
10 loops, best of 3: 27.3 ms per loop
In [25]: %timeit list(fj(6))
100 loops, best of 3: 8.32 ms per loop
In [26]: %timeit list(peter(range(7)))
1 loops, best of 3: 506 ms per loop
In [27]: %timeit list(fj(7))
10 loops, best of 3: 81 ms per loop
Upvotes: 3
Reputation: 365587
Your questions are about how to do so without creating an actual list
in memory, and how to run a function on each one, so I'll answer that part generically first, then for your specific problem.
The total number of possible lists is n*(n-1) * (n choose 2) so I would like to avoid storing them all in memory if at all possible.
You just need to create a generator function, which yield
s answers one by one, instead of a regular function that appends each answer to a list
and then returns that list
.
For each list I want to call a function that calculates a function of the list. What's a good way to do this?
You can call a generator function's and use the result as an iterator. For example:
for element in my_generator_function(n):
my_function(element)
If you want the results of calling my_function
on each element, you can build another generator function:
def apply_to(my_function, my_iterator):
for element in my_iterator):
yield my_function(element)
However, there's already a built-in function that does exactly that, called map
in Python 3, or itertools.imap
in Python 2.
Or, even more simply, you can use a generator expression:
results = (my_function(element) for element in my_generator_function(n))
For more, see the Iterators, Generators, and Generator Expressions section in the official Python tutorial.
Now, you can break your problem down like this:
For n=2, this means all permutations of (0, 0) together with 0 numbers from (1), and all permutations of (1, 1) together with 0 numbers from (0). For n=3, you've got all permutations of (0, 0) together with 1 of the numbers (1, 2), and all permutations of (1, 1) together with 1 of the numbers (0, 2), and all permutations of (2, 2) together with 1 of the numbers (0, 1). And so on.
So, translating that directly to Python, using the magic of itertools
:
def one_duplicate_element(n):
for i in range(n):
rest = (j for j in range(n) if j != i)
for combination in itertools.combinations(rest, n-2):
yield from itertools.permutations(combination + (i, i))
This uses the yield from
statement in Python 3.3, which means "yield every element from this iterator". If you're using 3.2 or earlier, you need to do that explicitly with a loop:
def one_duplicate_element(n):
for i in range(n):
rest = (j for j in range(n) if j != i)
for combination in itertools.combinations(rest, n-2):
for permutation in itertools.permutations(combination + (i, i)):
yield permutation
But you'll notice that, while this does exactly what you asked for, it doesn't generate the same list as your sample output. Even ignoring ordering (which I don't think is relevant for you), there are duplicates. Why?
Well, what are the permutations of the set (0, 0)
? That's a trick question; (0, 0)
can't be a set, because it has duplicates in it. So, whatever (0, 0)
is, what are its permutations? Well, by the usual definition of permutations on ordered tuples, what would be ((0, 0), (0, 0))
. And that's exactly what itertools.permutations
gives you.
You can look at how permutations
works and write a permutations_without_duplicates
function yourself, or you can use the unique_everseen
function in the Recipes section of the docs, or you can sort the results and use the unique_justseen
, or… Well, let's pick one:
def one_duplicate_element_no_duplicates(n):
yield from unique_everseen(one_duplicate_element(n))
Upvotes: 1