Reputation: 9

How to create a macro in SAS to repeat PROC MIXED in a large data set

I am currently working with a data set of 760 metabolites. These metabolites were provided to 15 bacterial species. Further, their growth was monitored at 2 optical densities (OD), in triplicate. Therefore, I have a data set with 1520 rows and 17 columns.

optins formdlim='.';
data = Vera`                                                                             
input Metabolite$ OD P1 P2  P3  P4  P5  P6  P7  C8  C9  C10 B11 B12 B13 B14 B15
;

cards;

proc print; run;

What I want to do is to find out whether the data of the 2 optical densities is the same (no significant difference between the data at 492 and 630 for each metabolite. Therefore, I wrote the following code:

PROC mixed DATA=Vera;
CLASS OD ;
MODEL P1 = OD / SOLUTION ;
lsmeans OD/ diff;
RUN;

With this you can analyse the differences between the ODs at 492 and 630 for each of the bacterial species. However, you would have to separate the data and run each metabolite separately. That would not be a problem with a small number of variables, but I have 760. So I don't want to repeat the procedure and input manually the code 760 times. I want to write a macro to use in SAS and repeat the same syntax for each of the metabolites. How can I do that?

My data looks like this:

Metabolite  OD  P1           P2          P3          P4          P5          P6          P7         C8          C9           C10         B11        B12         B13         B14         B15
C1         492  0.80318008  0.834511094 0.755462174 0.947215787 0.887920107 0.941135272 0.854403285 0.827162124 0.774818623 1.043873527 0.980611933 0.99175232  0.899985465 2.323935576 0.989680725
C1         492  1.015295591 0.937931127 0.862409875 1.035489644 1.020100969 1.432972263 1.20098598  1.014347313 1.024901914 1.350518389 1.228546301 1.058456868 1.021602321 0.882652756 1.068231275
C1         492  0.810476853 0.767190317 0.566538969 1.160767653 1.036374265 1.007790833 1.190486783 1.113972414 0.325186332 0.907718954 1.675218213 0.906072763 1.410147143 1.060946843 1.067602052
C1         630  0.961524961 1.005846657 0.847824375 1.025462906 0.976906071 0.976627864 1.01474825  0.903212955 0.934967536 0.882814468 1.001740347 0.903248894 0.996416257 1.02681187  0.916566129
C1         630  1.554650956 0.737506567 2.452827299 1.037786536 0.874060377 0.950382623 1.081525591 2.143129784 1.077641166 1.993884723 1.685291793 0.927601975 1.097186964 0.84841252  0.942020551
C1         630  3.397638555 3.48494389  2.736307131 4.485634181 4.927877673 4.754434301 5.041446678 3.008039216 1.24514729  3.849372819 3.335763153 4.537001962 4.347699905 2.650736885 5.007861571
C2         492  0.621121776 0.655197791 0.624464533 0.774748488 0.835036637 0.890241965 1.050214203 0.766379479 0.499753317 0.708279952 0.851083004 0.833468896 0.842360044 0.536406298 0.722104984
C2         492  1.75496053  1.625140448 1.234260466 1.600459563 1.805650674 3.902582698 4.366733197 4.3322092   0.884777351 3.659221055 3.698372956 4.424445968 3.911657965 1.184654064 3.032617686
C2         492  1.136163306 0.990741638 1.008046619 1.090941503 1.065424996 1.286243284 1.162517672 1.086776372 1.050708989 0.947436205 1.255244694 1.097283143 1.064965485 1.025620139 0.974254224
C2         630  1.113004223 1.481277257 1.117820203 1.606865598 1.547740666 1.923981394 1.79028251  1.600927099 0.651330519 1.688562315 1.671669463 1.596206391 1.999786168 1.112853138 1.95607287
C2         630  0.802575958 0.63027506  0.688188658 0.879770793 0.779821048 0.884177322 0.942509034 0.755849107 0.630951119 0.712527463 0.897567203 0.847457282 0.838313324 0.696858072 0.737402398
C2         630  3.868652818 3.623364192 2.899296194 4.850127834 5.171682933 5.239876518 5.407341626 3.381502495 1.345204779 4.170354345 3.676830466 4.893081332 4.646074976 2.792233812 5.15275719

I tried creating a macro but I think I am not getting what I would like to see. Again, I have 16 bacterial species, which were grown in presence of 760 metabolites. The growth was measured at 2 different OD. I want to find out whether the OD measurement for each metabolite on each species is significantly different. I modified the macro and the model as follows:

options formdlim='-';
data vera;
input Metabolite$ OD P1 P2 P3 P4 P5 P6 P7 C8 C9 C10 B11 B12 B13 B14 B15 B16;
cards;
%macro metabolites(varsel); 

PROC mixed DATA=Vera;
CLASS OD ;
MODEL &varsel = OD / ddfm=kr ;
lsmeans OD/pdiff;
RUN;
%mend 
%metabolites (P1); 
%metabolites (P2);
%metabolites (P3); 
%metabolites (P4); 
%metabolites (P5); 
%metabolites (P6); 
%metabolites (P7); 
%metabolites (C8);
%metabolites (C9); 
%metabolites (C10); 
%metabolites (B11); 
%metabolites (B12); 
%metabolites (B13); 
%metabolites (B14); 
%metabolites (B15); 
%metabolites (B16);

With this model I can see that all the metabolites measured in a particular species at a specific OD (492 nm, for example) are not significantly different from ALL the metabolites measured at 630 nm. Unfortunately, this does not have biological relevance and I am still needing to repeat the syntax every time that I want to find significant differences among ODs in a particular metabolite in a particular species.

I also tried sorting out the data set with the "BY" statement but I did not get any different output. Is there anything else that I am missing?

Upvotes: 0

Answers (2)

Reeza

Reputation: 21294

Change your data structure so you can use the BY statement as mentioned above. I haven't tested it but something like the following should work:

*flip your data;
data flipped;
set vera;
array bs(15) P1-P7 C8-C10 B11-B15;
do i = 1 to dim(bs);
    BS = vname(bs(i)); *capture the name of the variable;
    BS_Value = bs(i);
    output;
end;
run;

proc sort data=flipped; 
 by metabolite bs;
run;

PROC mixed DATA=Vera;
By metabolite bs;
CLASS OD ;
MODEL BS_value = OD / SOLUTION ;
lsmeans OD/ diff;
RUN;

Upvotes: 0

mlegge

Reputation: 6913

Just specify a BY statement:

PROC mixed DATA=Vera;
  CLASS OD ;
  MODEL P1 = OD / SOLUTION ;
  BY Metabolite;
  lsmeans OD/ diff;
RUN;

Knowledge base reference here.

Upvotes: 3

How to create a macro in SAS to repeat PROC MIXED in a large data set

Answers (2)

Related Questions