Jenny
Jenny

Reputation: 1

Code running slow in SAS, any idea what I can do to run it quicker?

I am a beginner with SAS and trying to create a table with code below. Although the code has been running for 3 hours now. The dataset is quite huge (150000 rows). Although, when I insert a different date it runs in 45 mins. The date I have inserted is valid under date_key. Any suggestions on why this may be/what I can do? Thanks in advance

proc sql;
create table xyz as
select monotonic() as rownum ,*
from x.facility_yz   
where (Fac_Name = 'xyz' and (Ratingx = 'xyz' or Ratingx is null) )
and Date_key = '20000101'
;
quit;

Tried running it again but same problem

Upvotes: 0

Views: 2114

Answers (2)

SMA
SMA

Reputation: 15

What kind of libname to you use ?

if you are running implicit passthrough using sas function, it would explain why it takes so long.

If you are using sas/connect to xxx module, first add option to understand what is going on : options sastrace=,,,d sastraceloc=saslog;

You should probably use explicit passthrough : using rdbms native language to avoid automatic translation of your code.

Upvotes: 0

Stu Sztukowski
Stu Sztukowski

Reputation: 12899

Is your dataset coming from an external database? A SAS dataset of this size should not take nearly this long to query - it should be almost instant. If it is external, you may be able to take advantage of indexing. Try and find out what the database is indexed on and try using that as a first pass. You may consider using a data step instead rather than SQL with the monotonic() function.

For example, assume it is indexed by date:

data xyz1;
    set x.facility_xyz;
    where date_key = '20000101';
run;

Then you can filter this final dataset within SAS itself. 150,000 rows is nothing for a SAS dataset, assuming there aren't hundreds of variables making it large. A SAS dataset this size should run lightning fast when querying.

data xyz2;
    set xyz1;
    where fac_name = 'xyz' AND (Ratingx = 'xyz' or Ratingx = ' ') );

    rownum = _N_;
run;

Or, you could try it all in one pass while still taking advantage of the index:

data xyz;
    set x.facility_xyz;
    where date_key = '20000101';

    if(fac_name = 'xyz' AND (Ratingx = 'xyz' or Ratingx = ' ') );

    rownum+1;
run;

You could also try rearranging your where statement to see if you can take advantage of compound indexing:

data xyz;
    set x.facility_xyz;
    where     date_key = '20000101'
          AND fac_name = 'xyz' 
          AND (Ratingx = 'xyz' or Ratingx = ' ')
    ;

    rownum = _N_;
run;

More importantly, only keep variables that are necessary. If you need all of them then that is okay, but consider using the keep= or drop= dataset options to only pull what you need. This is especially important when talking with an external database.

Upvotes: 2

Related Questions