intersecting one dataset based on second dataset in sas

Question

I have a dataset called 'primary_data' which looks like this:

Obs  Id     Month   Name    Salary
1   10006   JAN1926 name1   3273
2   10006   FEB1926 name1   7143
3   10007   JAN1926 name2   4274
4   10008   JAN1926 name3   2591
5   10008   FEB1926 name3   2394
6   10009   JAN1926 name4   4416
.
.

And I have another dataset called 'selection' which looks like this:

Obs Id
1   10006
2   10008
.
.

Note: Ids are unique in 'selection' dataset

I need to take a subset of primary_data where the Ids are taken from 'selection' dataset. I tried merge but I am getting the whole primary_data as my output.

data merged_data;
    merge primary_data selection;
    by Id;
run;

In my example above, I would like merged_data to have entries corresponding to Ids: 10006 and 10008 but not from Ids 10007 and 10009. That is, merged_data should look like:

Obs  Id     Month   Name    Salary
1   10006   JAN1926 name1   3273
2   10006   FEB1926 name1   7143
3   10008   JAN1926 name3   2591
4   10008   FEB1926 name3   2394
.
.

I do not understand, what am I doing wrong.

Robert Soszyński · Accepted Answer

If you want to use datastep to merge, you need to use in table option, because you want to use inner join.

e.g. in in2 variable you have information if record from selection table was used in output record:

data merged_data;
    merge primary_data(in=in1) selection(in=in2);
    by Id;

    if in1 and in2; /* inner join */
run;

intersecting one dataset based on second dataset in sas

Answers (1)

Related Questions