jayanth a
jayanth a

Reputation: 3

generate all combinations to extend to large columns

I have C columns where I want to fill all combinations of N elements (C>N). How can I do it in Matlab?
Example: C= 5, N=3; then all the combinations of N from 1 to N i.e. 1 to 3 to be filled in C=5 columns. I tried NchooseK but could not achieve this. The expected output looks as below:

Output: 11111, 11112, 11122, 11222, ....., 22222, 22223, 22233, ....., 33333, ..... 12312, 12313, 12323, ...so on ...
In this way all combinations of 1 to N filled upto C columns. My C can be as large as 40, and N can be 10.
Is there any way in Matlab.

Upvotes: 0

Views: 86

Answers (1)

Yvon
Yvon

Reputation: 2983

This is a tree traversal problem. Here is a DFS method:

function s = GetCombinations(C,N)
C = int64(C);
N = int64(N);
assert(C>0 && N>0, 'Invalid input')
s = prtstr(C,N,''); % wrap the iterative function
end

function s = prtstr(C,N,prefix)
s = cell(N^C,1);
if C == 1
    for ii = 1:N
        s{ii} = [prefix, num2str(ii)];
    end
else
    BlockLen = N^(C-1);
    for ii = 1:N
        s((1+(ii-1)*BlockLen) : (ii*BlockLen)) = ....
            prtstr(C-1,N,[prefix, num2str(ii)]);
    end
end
end

Output is a cell vector containing all combination strings.

>> s = GetCombinations(5,3)

s = 

    '11111'
    '11112'
    '11113'
    '11121'
    '11122'
    '11123'
    '11131'
    '11132'
    '11133'
    '11211'
    '11212'
    '11213'
    '11221'
    '11222'
    '11223'
    '11231'
    '11232'
    ....

Edit: Skip duplicate cases. For example, 11112 and 11121 are considered the same and only the first one is listed.

function s = GetCombinations(C,N)
C = int64(C);
N = int64(N);
assert(C>0 && N>0, 'Invalid input')
s = prtstr(C,N,'',1);
s = s(~cellfun(@isempty,s));
end

function s = prtstr(C,N,prefix,startN)
s = cell(N^C,1);
if C == 1
    for ii = startN:N
        s{ii} = [prefix, num2str(ii)];
    end
else
    BlockLen = N^(C-1);
    for ii = startN:N
        s((1+(ii-1)*BlockLen) : (ii*BlockLen)) = ....
            prtstr(C-1,N,[prefix, num2str(ii)], ii);
    end
end
end

>> s = GetCombinations(5, 3)

s = 

    '11111'
    '11112'
    '11113'
    '11122'
    '11123'
    '11133'
    '11222'
    '11223'
    '11233'
    '11333'
    '12222'
    '12223'
    '12233'
    '12333'
    '13333'
    '22222'
    '22223'
    '22233'
    '22333'
    '23333'
    '33333'

Edit: For large C and N, pre-allocating the cell array is not feasible. In fact most of the cells are empty. Which ones are empty can be pre-determined, but this itself is an interesting problem. Yet I just append the cell array on the go instead of pre-allocating it. So a bit slower but it works.

function s = GetCombinations(C,N)
C = int64(C);
N = int64(N);
assert(C>0 && N>0, 'Invalid input')
s = prtstr(C,N,'',1);
s = s(~cellfun(@isempty,s));
end

function s = prtstr(C,N,prefix,startN)
s = cell(0,1);
if C == 1
    nt = N-startN+1;
    t = cell(nt,1);
    for ii = 1:nt
        t{ii} = [prefix, num2str(ii+startN-1)];
    end
    s = t;
else
    for ii = startN:N
         t = prtstr(C-1,N,[prefix, num2str(ii)], ii);
         s = [s;t];
    end
end
end

EDIT: Memory-less version. Requested (C,N) = (20,10) and the result is too large to be held in the memory. One possible solution is to just print out the result without storing it. Below is the code, tested.

function GetCombinations(C,N, fh)
C = int64(C);
N = int64(N);
assert(C>0 && N>0, 'Invalid input')
prtstr(C,N,'',1, fh);
end

function prtstr(C,N,prefix,startN, fh)
if isempty(fh) % assign fh = [] if you do not want text file output
    fh = 1; % print to command window
end
if C == 1
    for ii = 1:N-startN+1
        fprintf(fh, '%s%d\n', prefix, ii+startN-1);
    end
else
    for ii = startN:N
        prtstr(C-1,N,[prefix, num2str(ii)], ii, fh)
    end
end
end

The function takes an additional input argument fh which is a file handle. Because (20,10) has quite a lot of lines of output (10m+ lines and 200MB+), you might want to print the result directly into a text file. To achieve this,

fh = fopen('testout.txt','w');
GetCombinations(20,10,fh)
fclose(fh)

Otherwise, if you really want to print the result into the command window, use

GetCombinations(20,10,[])

Upvotes: 1

Related Questions