Matt

Reputation: 171

Getting started with SAS parallel processing using MP Connect

I am trying to understand MP Connect and how I can use it for parallel processing.

As a simple example, I started a session that prints "Hello World!" forever and another that prints "Bye World!" once. I then issued "waitfor _any_" and "rget", expecting "Bye World!" in the log, because "Hello World!" will go on forever while "Bye World!" has already finished. Unfortunately, this doesn't work.

In general, I have great difficulties retrieving output from a remotely submitted task.

option cpucount=4 sascmd="!sascmd" autosignon;

rsubmit task1 wait=no;

    data _null_;
        do while(1);
            put "Hello World!";
        end;
    run;

endrsubmit;


rsubmit task2 wait=no;

    data _null_;
        put "Bye World!";
    run;

endrsubmit;


waitfor _any_;
rget;

signoff task1;
signoff task2;

Upvotes: 2

Views: 681

Answers (2)

FriedEgg

Reputation: 201

The issue is that, just as you say, TASK1 is set to run forever. Specifically, it is the statement

signoff task1;

that is causing your particular issue: you are telling the submitting process to wait for the indefinite task to end, and then sign off.

If instead you had

signoff task2;
killtask task1;

you would see that you do collect the log information from TASK2 with the RGET (the SIGNOFF statement would also collect it, even without the RGET). The information from TASK1 is lost in this case, but with the option already mentioned (LOG="task1.log") you can recover it separately.
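
As a rough sketch of that approach (assuming the remote session can write "task1.log" to its working directory), the LOG= option can be given on the RSUBMIT itself so the task's log survives even after the task is killed:

rsubmit task1 wait=no log="task1.log";
    data _null_;
        do while(1);
            put "Hello World!";
        end;
    run;
endrsubmit;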

In a case like this, the RGET statement does not wait for everything to complete. It collects what it can from any tasks that have finished at the time of the request, unless you specifically request RGET TASK1, in which case it will block there.
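
Putting that together, a sketch of one possible end-of-program sequence along those lines:

waitfor _any_ task1 task2;  /* returns as soon as the quick TASK2 finishes */
rget task2;                 /* collect only the completed task's log       */
signoff task2;              /* TASK2 has ended, so this returns right away */
killtask task1;             /* terminate the never-ending TASK1            */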

Upvotes: 1

Bendy

Reputation: 3576

The problem seems to be that when you have asynchronous processes running, they are completely disconnected from each other. Even though you're only waiting for the fast task2 to complete before continuing:

rsubmit task2 wait=no;
    data _null_;
        put "Bye World!";
    run;
endrsubmit;

SAS needs task1 to complete as well for the final rget:

rsubmit task1 wait=no;
    data _null_;
        do while(1);
        put "Hello World!";
        end;
    run;
endrsubmit;

I think what is happening is that task2 satisfies the waitfor _any_ condition, so the client session is able to carry on processing after it. However, the rget needs the final log file of each (completed) process before it can merge them into the client session's log window.

Have a look at the details section of the SAS documentation here:

EDIT:

Playing around a bit more, you can test the connections between the sessions using a shared libname (each process has its own unique WORK library, so they cannot conflict with each other):

Assign libname and options as required on the client machine:

libname testlib 'C:/test' ;
option cpucount=4 sascmd="!sascmd" autosignon;

Define two processes to run in parallel:

* Process 1 ;
rsubmit task1 wait=no;
libname testlib 'C:/test' ;
data testlib.test1 ;
  do i=1 to 1000 ;
    do j=1 to 1000;
      output ;
    end ;
  end ;
run ;
endrsubmit;

* Process 2 ;
rsubmit task2 wait=no;
libname testlib 'C:/test' ;
data testlib.test2 ;
  do i=1 to 1000 ;
    output ;
  end ;
run ;
endrsubmit;

You can then run the following code while process 1 is still running; it will be able to access the output dataset of process 2:

* Wait for either of the above processes and then process the remaining code ;
waitfor _any_;

proc sql noprint ;
  select sum(i)
  into :result
  from testlib.test2 
;quit ;

%put *** SUM OF TEST2 IS: &result *** ;
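
When you are finished with the result, you can wait for process 1 to complete as well, retrieve both logs, and close the remote sessions (a sketch, reusing the task names above):

* Wait for process 1 too, then pull both logs back and sign off ;
waitfor _all_ task1 task2;
rget task1;
rget task2;
signoff task1;
signoff task2;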

Upvotes: 2
