Allan Cumming
Allan Cumming

Reputation: 91

Is it possible to extract data from a dataset without merging it into another dataset?

I would like to find out if it is possible to perform conditional clauses on a dataset that has not been merged into the current dataset?

"RAND = 1 if subject has non-missing SUBJNO in DEMOG and at least one non-missing RANDSEQ in DOSEADM and SUBJNO is not included in SFL."

Above is what is required.

For clarity the datasets are DEMOG, DOSEADM, and SFL. SFL is a screening failure dataset, so it will only contain subjects that fail screening. DEMOG is a demography dataset and will only include subjects that passed screening. DOSEADM is a dosing administration dataset.

In OOP psuedocode it could possibly be something like the following.

If ^missing(DEMOG.subjno) and nmiss(DOSEADM.randseq) >=1 and SUBJNO ^in(SFL.SUBJNO) then flag.

Is it possible for something similar to be done in SAS? Or is it possible to do something similar in SQL within SAS?

Upvotes: 1

Views: 97

Answers (2)

scott
scott

Reputation: 2275

I think I understood the second condition you were going for correctly, but please clarify if I am wrong.

You can do this using sub-queries in proc sql. Each sub-query must only return one column of data, then using a case clause you can test whether your SUBJNO is in each of the conditions and return 1 if true for all, else 0.

 proc sql noprint;

    create table table_new as
    select *, case
                when SUBJNO in (select distinct SUBJNO from DEMOG where SUBJNO ne .) /*this finds where subjno not missing in DEMOG table*/
                    and SUBJNO not in (select distinct SUBJNO from SFL) /*this finds where subjno not in SFL table*/
                    and SUBJNO in (select distinct SUBJNO from DOSEADM where RANDSEQ = . group by SUBJNO having count(SUBJNO) > 1) /*This finds the subjno having >1 missing randseq observations*/ then 1
                else 0
                end as RAND
    from table_old
    ;
  quit;

Upvotes: 1

Joe
Joe

Reputation: 63434

I would create formats for the subjects in the datasets. For example, you can take DEMOG:

data for_fmt; *create a dataset for formats, name it something relevant to you;
set dmog;
start=subjno;
label='1';
fmtname='DEMOG'; *include $ in front if subjno is a character variable;
output;
if _n_ = 1 then do;
hlo='o';
start=.; *or ' ' if char;
label='0';
output;
run;

proc format cntlin=for_fmt; *load to formats library;
quit;

data want;
set have;
if put(subjno,DEMOG.)='1' then output; *keep only DEMOG subjects;
run;

You can do that with each of those datasets (the DOSEADM condition seems easy to implement here as well) and then use the three formats.

Upvotes: 1

Related Questions