teeeeee
teeeeee

Reputation: 737

Faster way to transpose each row of a matrix and multiply the resulting vector by some other matrix?

I have an input matrix X with dimensions N_rows x N_cols. I also have a sparse, tridiagonal matrix M which is square of size N_rows x N_rows. These are created as follows:

N_rows = 3;
N_cols = 6;
X = rand(N_rows,N_cols);

mm = 10*ones(N_cols,1); % Subdiagonal elements
dd = 20*ones(N_cols,1); % Main diagonal elements
pp = 30*ones(N_cols,1); % Superdiagonal elements
M = spdiags([mm dd pp],-1:1,N_cols,N_cols);

and look like the following:

>> X

X =

    0.4018    0.1233    0.4173    0.9448    0.3377    0.1112
    0.0760    0.1839    0.0497    0.4909    0.9001    0.7803
    0.2399    0.2400    0.9027    0.4893    0.3692    0.3897

full(M)

ans =

     2     3     0     0     0     0
     1     2     3     0     0     0
     0     1     2     3     0     0
     0     0     1     2     3     0
     0     0     0     1     2     3
     0     0     0     0     1     2

I would like to take each row of X, and do a matrix multiplication with M, and piece the obtained rows back together to obtain an output Y. At the moment, I achieve this successfully with the following:

Y = (M*X.').';

The example above is for a 3x6 matrix for X, but in reality I need to do this for a matrix with dimensions 500 x 500, about 10000 times, and the profiler says that this operation in the bottleneck in my larger code. Is there a faster way to do this row-by-row matrix multiplication multiplication?

On my system, the following takes around 20 seconds to do this 10000 times:

N_rows = 500;
N_cols = 500;
X = rand(N_rows,N_cols);

mm = 10*ones(N_cols,1); % Subdiagonal elements
dd = 20*ones(N_cols,1); % Main diagonal elements
pp = 30*ones(N_cols,1); % Superdiagonal elements
M = spdiags([mm dd pp],-1:1,N_cols,N_cols);

tic
for k = 1:10000
Y = (M*X.').';
end
toc

Elapsed time is 18.632922 seconds.

Upvotes: 2

Views: 328

Answers (2)

rahnema1
rahnema1

Reputation: 15837

Another option is using conv2:

Y = conv2(X, [30 20 10], 'same');

Explanation:
There is a tridiagonal matrix that all elements on each diagonal are identical to each other:

M = 
 2     3     0     0     0     0
 1     2     3     0     0     0
 0     1     2     3     0     0
 0     0     1     2     3     0
 0     0     0     1     2     3
 0     0     0     0     1     2

Suppose you want to multiply the matrix by a vector:

V = [11 ;12 ;13 ;14 ;15 ;16];
R = M * V;

Each element of the vector R is computed by sum of products of each row of M by V:

R(1):
 2     3     0     0     0     0
 11    12    13    14    15    16
R(2):
 1     2     3     0     0     0
 11    12    13    14    15    16
R(3):
 0     1     2     3     0     0
 11    12    13    14    15    16
R(4):
 0     0     1     2     3     0
 11    12    13    14    15    16
R(5):
 0     0     0     1     2     3
 11    12    13    14    15    16
R(6):
 0     0     0     0     1     2
 11    12    13    14    15    16

It is the same as multiplying a sliding window of [1 2 3] by each row of M. Basically convolution applies a sliding window but first it reverses the direction of window so we need to provide the sliding window in the reversed order to get the correct result. Because of that I used Y = conv2(X, [30 20 10], 'same'); instead of Y = conv2(X, [10 20 30], 'same');.

Upvotes: 2

Luis Mendo
Luis Mendo

Reputation: 112689

You can use X*M.' instead of (M*X.').';. This saves around 35% of time on my computer.

This can be explained because transposing (or permuting dimensions) implies rearranging the elements in the internal (linear-order) representation of the matrix, which takes time.

Upvotes: 4

Related Questions