Reputation: 21
The itertools.combinations function of the CPython's standard library doesn't have an $O(1)$ amortized time complexity. When generating $k$-combinations of $n$ elements, the main issue arises from when $k$ gets close to $n$. We end up with an $O(n)$ amortized time complexity.
Quick details of why i ended up with amortized time complexity of $O(n)$ : In that code, the number of combinations ensuring the pivot index is $i$ is $\binom{n-k+i}{i+1}$ and the number of array operations for such an instance is $3k-3i-1$. By multipliying both quantities and summing over all possible values of $i$, we obtain the total number of operations. Dividing this by $\binom{n}{k}$ the number of combinations, we end up with the amortized time complexity of $O(n)$.
The following self-contained C code is a rewrite of the CPython itertools.combinations function made for handling k-combinations of an array of integers between 0 and n-1, in an iterator-type style.
#include <stdlib.h> // malloc
#include <stdio.h> // printf
#include <stdbool.h> //bool, true, false
/* TESTING */
int* integersInOrder(int size){
int* array = malloc(size * sizeof(int));
for(int i=0; i < size; i++)
array[i] = i;
return array;
}
void printArray(int* array, int size){
for(int i=0; i<size; i++)
printf("%d ", (int)array[i]);
printf("\n");
}
/* itertools.combinations */
static const bool IS_RUNNING = true, IS_DONE = false;
bool nextCombination(int* indices, int n, int k){
if(k > n)
return IS_DONE;
bool isFound = false;
int i;
// find furthest-right non-maximal index
for(i = k - 1; i >= 0; i--){
if(indices[i] != n - k + i){
isFound = true;
break;
}
}
if(!isFound)
return IS_DONE;
// increase index i by 1
indices[i]++;
// set all indices to the right of i to be the lexicographical minimum values
for(int j = i + 1; j < k; j++)
indices[j] = indices[j - 1] + 1;
return IS_RUNNING;
}
/* ENTRY POINT */
int main()
{
int n = 5, k = 3;
int* array = integersInOrder(k);
printArray(array, k);
while(IS_RUNNING == nextCombination(array, n, k)){
printArray(array, k);
}
return 0;
}
Upvotes: 2
Views: 113
Reputation: 10477
The problem with your solution is that you modify array
in place. If CPython were to adapt your implementation, the result of list(itertools.combinations(range(4), 3))
would be something like [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
, which would be surprising to users. You would need to make a copy of the first k entries in array
and yield that, which will take O(k) at the minimum, which, as k approaches n, is O(n).
You can't escape these copies without breaking user's expectations.
Upvotes: 1