Reputation: 397
I have two tables on SAS "Table_A" and "Table_A_Archive", as part my ETL process "Table_A" is created on a daily basis and the data should be archived on "Table_A_Archive". When the data is archived on "Table_A_Archive" a a flag will be created/updated "DT_FLAG".
On the first day this is how the table would look like
"Table_A"
| ID | Load_Date
------ -------------
| 100 | 01JUN2020:12:13:56
"Table_A_Archive"
| ID | Load_Date | DT_FLAG
------ --------------------- ---------
| 100 | 01JUN2020:12:13:56 | 1
On Day 2
"Table_A"
| ID | Load_Date
------ ------------
| 101 | 02JUN2020:12:13:56
"Table_A_Archive"
| ID | Load_Date | DT_FLAG
------ --------------------- ---------
| 100 | 01JUN2020:12:13:56 | 2
| 101 | 02JUN2020:12:13:56 | 1
The new data should be loaded with the DT_FLAG of 1 and old records DT_FLAG should be incremented by 1. Load_Date is the pivotal point. I have written a SAS code but it seems a bit messy, can someone please help me with a SAS Datastep
%macro Cntl_archive(table_name=,arch_table_name=);
%GLOBAL WRK;
%if %sysfunc(exist(&arch_table_name.)) %then %do;
proc append base=&arch_table_name. data=&table_name. force;
run;
proc sql;
Create table TEMP as
Select distinct Load_Date,Load_Date as WRK from &arch_table_name.
order by Load_Date desc
;quit;
proc rank data=TEMP descending out=TEMP;
var WRK;
ranks count;
run;
data &arch_table_name. (drop=DT_FLAG);
set &arch_table_name.;
run;
proc sql;
Create table &arch_table_name. as
Select T0.*,T1.count as DT_FLAG from &arch_table_name. T0
inner join TEMP T1 on T0.Load_Date=T1.Load_Date
;quit
%end;
%else %do;
data &arch_table_name.;
set &table_name.;
DT_FLAG= 1;
IS_ACTIVE='';
run;
%end;
%mend Cntl_archive;
Upvotes: 1
Views: 832
Reputation: 397
I tried solving it by this method.
%macro Cntl_archive(table_name=,arch_table_name=);
%if %sysfunc(exist(&arch_table_name.)) %then %do;
data Data_append;
set &table_name.;
if _n_ = 1
then do;
set &arch_table_name.(keep=dt_flag) point=nobs nobs=nobs;
dt_flag + 1;
end;
run;
proc append base=&arch_table_name. data=Data_append force;
run;
%end;
%else %do;
data &arch_table_name.;
set &table_name.;
DT_FLAG= 1;
IS_ACTIVE='';
run;
%end;
%mend Cntl_archive;
Upvotes: 0
Reputation: 27518
Use Proc APPEND
and compute DT_FLAG
on the fly when needed. No need to mess with the archive other than add records to it.
On-the-fly would be DATA step view.
Example:
The example want
data sets are in WORK.
but would be some PERM.
in your real word case.
* simulate a clean start and some ETL activity with APPEND archiving;
proc delete data=want;
proc delete data=want_archive;
* DAY 1, load #1;
data DAILY_ETL;
ID = 100; load_date = today()-100; format load_date yymmdd10.;
run;
data want;
set DAILY_ETL;
run;
proc append base=want_archive data=want;
run;
* DAY 2, load #2;
data DAILY_ETL;
ID = 100; load_date = today()-99; format load_date yymmdd10.;
run;
data want;
set DAILY_ETL;
run;
proc append base=want_archive data=want;
run;
* DAY 4, load #3;
data DAILY_ETL;
ID = 100; load_date = today()-97; format load_date yymmdd10.;
run;
data want;
set DAILY_ETL;
run;
proc append base=want_archive data=want;
run;
and viewing
* view for on-the-fly DT_FLAG (do once);
data want_archive_v;
set want_archive nobs=N;
dt_flag = N - _N_ + 1;
run;
dm 'viewtable want_archive_v';
Upvotes: 1
Reputation: 51621
Here is a method using the MODIFY
statement to update the values of DT_FLAG in the existing observations in place and append the new values.
First lets make the initial A and use it to create an empty A_ARCHIVE with the extra variable. (Note I renamed your timestamp variable to avoid the confusion caused by having a variable named "date" that has datetime values instead of date values.)
data a ;
input id load_dt :datetime.;
format load_dt datetime19.;
cards;
100 01JUN2020:12:13:56
;
data a_archive;
stop;
set a ;
dt_flag=0;
run;
Now let's append A to A_ARCHIVE.
data a_archive;
do while(not eof1);
modify a_archive end=eof1;
dt_flag=sum(dt_flag,1);
replace;
end;
do until(eof2);
set a end=eof2;
dt_flag=1;
output;
end;
run;
Now you can make a new version of A and re-run the same data step to append it.
data a ;
input id load_dt :datetime.;
format load_dt datetime19.;
cards;
101 02JUN2020:12:13:56
;
data a_archive;
do while(not eof1);
modify a_archive end=eof1;
dt_flag=sum(dt_flag,1);
replace;
end;
do until(eof2);
set a end=eof2;
dt_flag=1;
output;
end;
run;
Results:
Obs id load_dt dt_flag
1 100 01JUN2020:12:13:56 2
2 101 02JUN2020:12:13:56 1
Upvotes: 1
Reputation: 107747
Consider again proc sql
with a count correlated subquery. Unfortunately, SAS does not allow updating a table with values to itself, so a temp table copy is used. Below assumes ID is incremented with each day.
proc sql;
insert into Table_A_Archive (ID, Load_Date)
select ID, Load_Date
from Table_A;
create table temp as
select ID, Load_Date from Table_A_Archive;
update Table_A_Archive t
set DT_Flag = (select count(*)
from temp sub
where t.ID <= sub.ID
and t.Load_Date = sub.Load_Date);
drop table temp;
quit;
Upvotes: 0
Reputation: 1394
I think do the same thing by a merge
statement:
%if %sysfunc(exist(Table_A_Archive)) = 0 %then %do;
data Table_A_Archive;
set Table_A;
run;
%end;
data Table_A_Archive;
merge Table_A_Archive(in=ALL) Table_A;
by ID;
if ALL then DT_FLAG = sum(DT_FLAG,1);
else DT_FLAG = 1;
run;
Consider of you may want this daily work be as quick as possibly, I suggest to use update
or modify
statement to replace merge
:
%if %sysfunc(exist(Table_A_Archive)) = 0 %then %do;
data Table_A_Archive;
set Table_A;
run;
%end;
data Table_A_Archive;
update Table_A_Archive Table_A;
by ID;
if _iorc_ = %sysrc(_sok) then DT_FLAG = sum(DT_FLAG,1);
run;
It is more efficient because it can update(or modify) data without creating a copy of the data set.
Upvotes: 1