Reputation: 729
I have a dataset called 'primary_data' which looks like this:
Obs Id Month Name Salary
1 10006 JAN1926 name1 3273
2 10006 FEB1926 name1 7143
3 10007 JAN1926 name2 4274
4 10008 JAN1926 name3 2591
5 10008 FEB1926 name3 2394
6 10009 JAN1926 name4 4416
.
.
And I have another dataset called 'selection' which looks like this:
Obs Id
1 10006
2 10008
.
.
Note: Ids are unique in 'selection' dataset
I need to take a subset of primary_data where the Ids are taken from 'selection' dataset. I tried merge but I am getting the whole primary_data as my output.
data merged_data;
merge primary_data selection;
by Id;
run;
In my example above, I would like merged_data to have entries corresponding to Ids: 10006 and 10008 but not from Ids 10007 and 10009. That is, merged_data should look like:
Obs Id Month Name Salary
1 10006 JAN1926 name1 3273
2 10006 FEB1926 name1 7143
3 10008 JAN1926 name3 2591
4 10008 FEB1926 name3 2394
.
.
I do not understand, what am I doing wrong.
Upvotes: 1
Views: 51
Reputation: 1188
If you want to use datastep to merge, you need to use in table option
, because you want to use inner join
.
e.g. in in2 variable
you have information if record from selection table
was used in output record:
data merged_data;
merge primary_data(in=in1) selection(in=in2);
by Id;
if in1 and in2; /* inner join */
run;
Upvotes: 4