Plug4

Reputation: 3928

SAS: Avoid writing a file at every datastep on HD

Is it possible to avoid writing a file at every datastep in SAS? For instance, I start with two SAS data sets called have1 and have2 on my HD. I then do these simple SAS data steps:

data have3;
  merge have1 have2;
  by id;
run;

data have3;
  set have3;
  if id='5' then delete;
run;

proc sort data=have3;
  by id;
run;

proc summary data=have3;
  by id;
  output out=have4 sum(expense)=expense;
run;

Can I run the first two data steps and the proc sort in memory, and write only have4 to the HD? (In fact, I do the merge with hash objects.)

have3 is a big data set, so if I can avoid writing the data to my HD at every data step, that would be great.

Upvotes: 1

Views: 95

Answers (2)

Oliver

Reputation: 194

There is another, more primitive but simple, way to clean up the various data sets your program produces. Proc datasets will not prevent files from being created, but you can use it to delete any data that has outlived its usefulness. This example will delete have1 and have2.

proc datasets;
  delete have1 have2;
run;
quit;

Upvotes: 0

Joe

Reputation: 63424

The broad answer to your question is yes: in some cases you can use a view to avoid writing out intermediate datasets. You could also define a library in memory rather than on a hard disk (on Windows, the memlib option of the libname statement does this).
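As a rough sketch of the memory-library idea, assuming SAS on Windows (where the libname statement accepts the memlib option) and enough RAM to hold have3. The libref inmem and the path are made up for illustration, and the var statement and nway option are additions that are not in the original code:

```sas
/* Hypothetical libref and path; memlib is a Windows-only libname option. */
libname inmem 'C:\temp\inmem' memlib;

/* The intermediate dataset lives in the memory-based library. */
data inmem.have3;
  merge have1 have2;
  by id;
  if id='5' then delete;
run;

/* Only have4 is written to the regular library. */
proc summary data=inmem.have3 nway;
  class id;
  var expense;  /* analysis variable listed explicitly */
  output out=have4 sum(expense)=expense;
run;

/* Release the memory-based library when finished. */
libname inmem clear;
```

Whether this is faster than a view depends on how much memory SAS is allowed to use, so treat it as something to try rather than a guaranteed gain.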

In your specific case, some of the processing looks unnecessary anyway.

data have3;
  merge have1 have2;
  by id;
run;

data have3;
  set have3;
  if id='5' then delete;
run;

proc sort data=have3;
  by id;
run;

proc summary data=have3;
  by id;
  output out=have4 sum(expense)=expense;
run;

could be

data have3;
merge have1 have2;
by id;
if id='5' then delete;
run;

proc summary data=have3;
class id;
output out=have4 sum(expense)=expense;
run;

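Class doesn't require sorted input and works effectively like by in this case. One difference: class also produces an overall-total row, so add the nway option to the proc summary statement if you don't want it. There's no reason to separate the merge and the delete. It might be even more efficient to filter with where= dataset options on the incoming datasets, so the unwanted rows are never read into the step.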
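A minimal sketch of the where= variant, assuming id is a character variable in both have1 and have2, as the comparison with '5' in the question suggests:

```sas
data have3;
  /* where= filters each input as it is read, before the merge happens */
  merge have1(where=(id ne '5'))
        have2(where=(id ne '5'));
  by id;
run;
```

Because the filter is on the by variable, dropping id='5' from both inputs gives the same result as deleting those rows after the merge.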

You could even define have3 as a view, if you wanted.

data have3 /view=have3;  *other code is the same;

In this case, a dataset named have3 must not already exist in the same library, or the step will fail.
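Put together, the view version might look like the sketch below. It assumes have3 lives in the work library. The proc datasets cleanup, the var statement, and the nway option are additions for illustration, not part of the original code:

```sas
/* Remove any existing dataset named have3 so the view can be created. */
proc datasets library=work nolist;
  delete have3 / memtype=data;
quit;

/* Defining the view stores only the program; no data is written yet. */
data have3 / view=have3;
  merge have1 have2;
  by id;
  if id='5' then delete;
run;

/* The view runs here and feeds its rows directly to proc summary. */
proc summary data=have3 nway;
  class id;
  var expense;  /* analysis variable listed explicitly */
  output out=have4 sum(expense)=expense;
run;
```

Only have4 is stored as a dataset. The merged rows are produced on demand each time the view is read, so a view pays off most when it is read only once.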

Upvotes: 4
