Zack Fair
Zack Fair

Reputation: 239

Calculate products of columns according to combinations with replacement

The Problem

It's a bit difficult to explain but I will try my best. I know the equation to find the number of combinations with replacement. Let's say I have 6 vectors: A, B, C, D, E, F. If I want to find every possible cubic product of these 6 variables, it would be (6+3-1)!/3!(6-1)! = 56 combinations (see end). Similarly, if I want every quadratic product, it's 21. For just linear, of course 6 (just each variable itself). I want to calculate all 6+21+56 = 83 combinations. I am thinking of 3 loops and each inner loop starts iterating from its outer loop like

for i1=1:6
   X(:,?) = X.*X(:,i1)
   for i2=i1:6
      X(:,?) = X.*X(:,i2)
      for i3=i2:6
         X(:,?) = X.*X(:,i3)

But the index of the 83-column matrix to store all the data in the left-hand side is confusing me. They are marked with question marks as you can see.

PS: Might need to do this with 5th order too so it would add another 126 and 252 columns for a total of 461 columns. So a more generic code is better that doesn't hard-code 3rd order. But if it's hard-coded to 5th that's OK since I am definitely not going above that.

Either MATLAB or Python is fine since I can switch easily between both.

The quadratic combinations calculated with an example

Here is an example of the 21 columns I expect for the quadratic combinations of the 6 variables, A through F. Done in Excel. I have taken 3 samples for each vector. enter image description here

The cubic combinations list

Here are the 56 combinations I need to calculate:

A,A,A

A,A,B

A,A,C

A,A,D

A,A,E

A,A,F

A,B,B

A,B,C

A,B,D

A,B,E

A,B,F

A,C,C

A,C,D

A,C,E

A,C,F

A,D,D

A,D,E

A,D,F

A,E,E

A,E,F

A,F,F

B,B,B

B,B,C

B,B,D

B,B,E

B,B,F

B,C,C

B,C,D

B,C,E

B,C,F

B,D,D

B,D,E

B,D,F

B,E,E

B,E,F

B,F,F

C,C,C

C,C,D

C,C,E

C,C,F

C,D,D

C,D,E

C,D,F

C,E,E

C,E,F

C,F,F

D,D,D

D,D,E

D,D,F

D,E,E

D,E,F

D,F,F

E,E,E

E,E,F

E,F,F

F,F,F

Upvotes: 4

Views: 152

Answers (3)

smichr
smichr

Reputation: 19125

Your AAA, AAB, AAC, etc are (as your title indicates) combinations with replacement of the elements A - F. Here is a Python example with just 3 elements:

>>> import itertools
>>> [''.join(i) for i in itertools.combinations_with_replacement('abc', 3)]
['aaa', 'aab', 'aac', 'abb', 'abc', 'acc', 'bbb', 'bbc', 'bcc', 'ccc']

Upvotes: 0

bousof
bousof

Reputation: 1251

You can avoid confusion of indexing by using a counter:

clear all; close all

% Original matrix
M = [
   2 2 3 2 8 8;
   5 1 7 9 4 4;
   4 1 2 7 2 9
];

% Number of combinations
order = 3;
sizeX = nchoosek(size(M,2)+order-1,order);

% Combinations
imat = ones(sizeX,order);
for c=2:sizeX
    imat(c,:) = imat(c-1,:);
    for o=order:-1:1
        if (imat(c-1,o)<size(M,2))
            imat(c,o:end) = imat(c-1,o)+1;
            break
        end
    end
end

% Transpose & display combinations
imat = transpose(imat)

% Computations of products
X = ones(size(M,1),sizeX);
for o=1:order
    X = X.*M(:,imat(o,:));
end

% Display result
X

When you execute the script you get:

>> test_script
imat =
  Columns 1 through 16
     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1
     1     1     1     1     1     1     2     2     2     2     2     3     3     3     3     4
     1     2     3     4     5     6     2     3     4     5     6     3     4     5     6     4
  Columns 17 through 32
     1     1     1     1     1     2     2     2     2     2     2     2     2     2     2     2
     4     4     5     5     6     2     2     2     2     2     3     3     3     3     4     4
     5     6     5     6     6     2     3     4     5     6     3     4     5     6     4     5
  Columns 33 through 48
     2     2     2     2     3     3     3     3     3     3     3     3     3     3     4     4
     4     5     5     6     3     3     3     3     4     4     4     5     5     6     4     4
     6     5     6     6     3     4     5     6     4     5     6     5     6     6     4     5
  Columns 49 through 56
     4     4     4     4     5     5     5     6
     4     5     5     6     5     5     6     6
     6     5     6     6     5     6     6     6

X =
  Columns 1 through 16
     8     8    12     8    32    32     8    12     8    32    32    18    12    48    48     8
   125    25   175   225   100   100     5    35    45    20    20   245   315   140   140   405
    64    16    32   112    32   144     4     8    28     8    36    16    56    16    72   196
  Columns 17 through 32
    32    32   128   128   128     8    12     8    32    32    18    12    48    48     8    32
   180   180    80    80    80     1     7     9     4     4    49    63    28    28    81    36
    56   252    16    72   324     1     2     7     2     9     4    14     4    18    49    14
  Columns 33 through 48
    32   128   128   128    27    18    72    72    12    48    48   192   192   192     8    32
    36    16    16    16   343   441   196   196   567   252   252   112   112   112   729   324
    63     4    18    81     8    28     8    36    98    28   126     8    36   162   343    98
  Columns 49 through 56
    32   128   128   128   512   512   512   512
   324   144   144   144    64    64    64    64
   441    28   126   567     8    36   162   729

I tested it for order=4 and it should work.

Upvotes: 1

Luis Mendo
Luis Mendo

Reputation: 112749

This is a vectorized approach in Matlab. It should be fast, but is not memory-efficient, because it generates all Cartesian tuples of coumn indices, and then only keeps those that are non-decreasing.

x = [2 2 3 2 8 8; 5 1 7 9 4 4; 4 1 2 7 2 9]; % data
P = 2; % product order
ind = cell(1,P);
[ind{end:-1:1}] = ndgrid(1:size(x,2)); % Cartesian power of column indices with order P
ind = reshape(cat(P+1, ind{:}), [], P); % 2D array where each Cartesian tuple is a row
ind = ind(all(diff(ind, [], 2)>=0, 2), :); % keep only non-decreasing rows
result = prod(reshape(x(:,ind.'), size(x,1), P, []), 2); % apply index into data. This
% creates an intermediate 3D array. Compute products
result = permute(result, [1 3 2]); % convert to 2D array

Upvotes: 2

Related Questions