agstudy

Reputation: 121588

Compute a matrix product in Base SAS (not using IML)

To compute the product of two matrices, I am using this method:

  1. First I put my matrices in the long format (col,row,value)
  2. I use proc sql to compute the product of 2 matrices.
  3. I use proc transpose to put the result of the previous step in the wide format.
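
For reference, step 2 is the entry-wise definition of the matrix product written as a join. Assuming A is n x m and B is m x p (symbols introduced here only for illustration), joining on lA.col = lB.row selects the shared index k, and grouping by lA.row, lB.col sums over it:

```latex
(AB)_{ij} = \sum_{k=1}^{m} A_{ik}\, B_{kj}, \qquad i = 1,\dots,n, \quad j = 1,\dots,p
```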

My question: is there a simpler method? Or, at least, how can I simplify my code?

Here is my code:

/* macro to put a matrix in the long format*/
%macro reshape(in_A = , ou_A= );
    data &ou_A.;
    set &in_A.;
    array arr_A{*} _numeric_;
    row = _n_;
    do col = 1 to dim(arr_A);
        value = arr_A{col};
        output;
    end;
    keep row col value;
    run;
%mend;



%macro prod_mat( in_A = , in_B= ,ou_AB =);
  /* put the matrix in the long format */
    %reshape(in_A=&in_A.,ou_A=lA);
    %reshape(in_A=&in_B.,ou_A=lB);
  /* compute product */
    PROC SQL ;
    CREATE TABLE PAB AS
    SELECT lA.row, lB.col, SUM(lA.value * lB.value)  as value
        FROM lA JOIN lB ON lA.col = lB.row
     GROUP BY lA.row, lB.col;
    QUIT;

   /* reshape the output to the wide format */
    proc transpose data=PAB out=&ou_AB.(DROP=_name_) prefix=x;
        by row ;
        id col;
        var value;
    run;

%mend;


data A ; 
      input x1 x2 x3; 
    datalines ; 
    1 2 3
    3 4 4
    5 6 9
   ; 

data B ; 
      input x1 x2; 
    datalines ; 
    1 2
    3 4 
    4 5
    ; 

%prod_mat(in_A =A,in_B=B,ou_AB=AB)

Upvotes: 2

Views: 4242

Answers (1)

Dmitry Shopin

Reputation: 1763

Well, here's my variant. The code itself isn't shorter than yours, but for big matrices it'll work faster because it avoids an SQL join that forms a Cartesian product of individual elements. The main idea is to build a full join (Cartesian product) of the rows of A with the rows of transposed B, and then multiply corresponding columns. For example, in the case of 3x3 and 3x2 matrices, we'll need to:

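1) in the first row of the merged dataset, multiply and sum up the corresponding columns: column1*column4+column2*column5+column3*column6;

2) repeat this for the second row (it pairs the same row of A with the second column of B);

3) output both values in one row, then do the same for each remaining row of A.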
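
As a worked check of these steps, take the A and B from the question. The first row of A is (1, 2, 3), and the two rows of transposed B are (1, 3, 4) and (2, 4, 5), so the first two merged rows give:

```latex
\begin{aligned}
1\cdot 1 + 2\cdot 3 + 3\cdot 4 &= 19 \\
1\cdot 2 + 2\cdot 4 + 3\cdot 5 &= 25
\end{aligned}
```

These two sums form the first output row, (19, 25).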

%macro prod_mat_merge(in_A =,in_B=,ou_AB=);
/*determine number of rows and columns in the 2nd matrix*/
%let B_id=%sysfunc(open(&in_B));
%let B_rows=%sysfunc(attrn(&B_id,nobs));
%let B_cols=%sysfunc(attrn(&B_id,nvars));
%let rc=%sysfunc(close(&B_id));

/*transpose the 2nd matrix*/
proc transpose data=&in_B out=t&in_B(drop=_:);run;

/*making Cartesian product of the 1st and transposed 2nd matrices*/
data &ou_AB;
    do until(eofA);
        set &in_A end=eofA;
        do i=1 to n;
            set t&in_B nobs=n point=i;
            output;
        end;
    end;
run;

/*multiplication*/
data &ou_AB;
    /*new columns for products, equal to number of columns in the 2nd matrix*/
    array p[&B_cols];
    do j=1 to &B_cols;
        p[j]=0;
        set &ou_AB;
        array col _ALL_;
        /*multiply corresponding pairs of columns*/
        do i=&B_cols+2 to &B_cols+1+&B_rows;
            p[j]+col[i]*col[i+&B_rows];
        end;
    end;
    output;
    keep p:;
run;
%mend prod_mat_merge;
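
A minimal usage sketch, assuming the macro above has already been compiled and that A and B are the example datasets from the question. The output name AB_merge is a hypothetical choice made here for illustration:

```sas
/* Assumption: %prod_mat_merge is compiled, and WORK.A (3x3) and
   WORK.B (3x2) exist as defined in the question.
   AB_merge is a hypothetical output dataset name. */
%prod_mat_merge(in_A=A, in_B=B, ou_AB=AB_merge)

/* Inspect the result: one row per row of A,
   one p-column per column of B */
proc print data=AB_merge noobs;
run;
```

Worked by hand, the product of the question's A and B has the rows (19, 25), (31, 42) and (59, 79), which gives something to compare the listing of p1 and p2 against.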

I've tested both methods by multiplying two random 100x100 matrices. The method with reshaping and an SQL join takes ~1.5 sec, while the method with merging takes ~0.2 sec.

Upvotes: 3
