gaussblurinc
gaussblurinc

Reputation: 3682

SAS: Data Step and additional row

i have an data step.

   data One;
      set Two;
      /*some code*/
   run;

How to add additional row after last row of new table?

Is it possible to do optimal? (One table may have many rows ~ 100k - 10M)

Thanks! ( it is TOTAL row, yes )

Upvotes: 0

Views: 7290

Answers (4)

Will
Will

Reputation: 21

I don't know why the first answer talked about "totals". The correct way to update a table in-place is:

data newdata;
/* some code to generate your data */
proc append base=olddata data=newdata;
run;

This is much quicker than doing any modification that requires a scan of the dataset.

An interesting way to do the same thing is:

data newdata;
/* some code to generate your data */
data olddata;
modify olddata newdata;
by mykey;
run;

This does an update in-place without rewriting a large dataset. It uses the column key to identify matching rows and allows you to update rows without re-writing the dataset, appending those that are not found at the end.

Upvotes: 2

Longfish
Longfish

Reputation: 7602

Just to add my thoughts, the solution offered by Chris J is a good one in that it only requires one pass if the data. However, doing something like a proc summary and then appending the results to the end of the dataset is easier to code, it all comes down to how efficient each method is. If you wanted to test Chris J's solution, then here is an example using the sashelp.class dataset. This may well be the quickest solution for large datasets.

/* count number of numeric variables and assign to macro variable */
proc sql noprint;
select count(*) into :num_ct from dictionary.columns
where libname='SASHELP' and memname='CLASS' and type='num';
quit;

%put numeric variables = &num_ct.;

/* sum all numeric variables and output at the end as a TOTAL row */
data class;
set sashelp.class end=eof;
array numvar{&num_ct.} _numeric_;
array sumvar{&num_ct.} _temporary_;
do i=1 to &num_ct.;
    sumvar{i}+numvar{i};
end;
output;
if eof then do;
    call missing(of _all_);
    name='TOTAL';
    do i=1 to &num_ct.;
    numvar{i}=sumvar{i};
    end;
output;
end;
drop i;
run;

Upvotes: 1

Chris J
Chris J

Reputation: 7769

data one;
  set two end=eof ;
  /* do stuff */
  output ;
  if eof then do ;
    /* do extra row stuff */
    output ;
  end ;
run ;

Upvotes: 3

DataParadigms
DataParadigms

Reputation: 437

Simply appending the total line to your data set.

data one;
 set Two;
 /* some code * /
run;

data total;
 /* some code or use a proc to generate your totals */
run;

data final;
 set one total;
run;

will append the total row to the dataset one.

If you're generating summary lines / totals, I'd suggest you look into using proc report. There's an option to output a dataset that has any summaries, groupings, etc that you create within the report procedure. It can be a really powerful proc.

Something like this SUGI Paper might be useful to get you started.

EDIT: based on the comments

Summary for all:

 proc report data = sashelp.iris out=summary nowd;
   col species sepallength sepalwidth petallength petalwidth;

   rbreak after / summarize;
 run;

Summarize each species group and get total line

 proc report data = sashelp.iris out=summary nowd;
   col species sepallength sepalwidth petallength petalwidth;
   define species / group;
   rbreak after / summarize;
 run;

Upvotes: 3

Related Questions