Martin Reindl

Reputation: 1603

Filter a SAS dataset to contain only identifiers given in a list

I am working in SAS Enterprise Guide and have a one-column SAS table that contains unique identifiers (id_list).

I want to filter another SAS table so that it contains only the observations whose identifier appears in id_list.

My code so far is:

proc sql noprint;
    CREATE TABLE test AS
    SELECT *
    FROM  data_sample
    WHERE id IN id_list
quit;

This code gives me the following error:

ERROR 22-322: Syntax error, expecting one of the following: (, SELECT.

What am I doing wrong?

Thanks in advance for the help.

Upvotes: 1

Views: 3101

Answers (3)

Ruben ten Cate

Reputation: 26

The problem with using a data step merge instead of PROC SQL is that the data sets must be sorted by the variable used for the merge. If that is not yet the case, the complete data set must be sorted first.

If I have a very large SAS data set that is not sorted by the merge variable, I have to sort it first, which can take quite some time. If I use the subquery in PROC SQL, I can read the data set selectively, so no prior sort is needed.

My bet is that PROC SQL is much faster for large data sets from which you want only a small subset.
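
For reference, the extra sorting that a data step merge would need might look roughly like the sketch below. It assumes the key variable is called id in both data sets (the column name inside id_list is not stated in the question), and the _sorted output names are hypothetical.

```sas
/* Sort both inputs by the key before a data step MERGE ... BY.   */
/* Assumption: the key variable is named id in both data sets.    */
/* out= writes sorted copies, so the original data sets are kept. */
proc sort data=data_sample out=data_sample_sorted;
  by id;
run;

proc sort data=id_list out=id_list_sorted;
  by id;
run;
```

On a large data_sample, the first PROC SORT is the step that is likely to dominate the run time.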

Upvotes: 0

Tom

Reputation: 51621

You can't just give it the table name. You need a subquery that names the variable you want it to read from ID_LIST.

proc sql;
  CREATE TABLE test AS
  SELECT *
  FROM data_sample
  WHERE id IN (select id from id_list)
  ;
quit;
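
To try the subquery end to end, here is a minimal self-contained sketch. The sample rows, and the assumption that the column in id_list is named id, are made up for illustration and are not taken from the question.

```sas
/* Hypothetical one-column list of IDs to keep */
data id_list;
  input id;
  datalines;
2
4
;
run;

/* Hypothetical table to be filtered */
data data_sample;
  input id value $;
  datalines;
1 a
2 b
3 c
4 d
;
run;

/* Keep only the rows of data_sample whose id appears in id_list */
proc sql;
  create table test as
  select *
  from data_sample
  where id in (select id from id_list);
quit;

proc print data=test;
run;
```

With these sample rows, test should end up holding only the observations with id 2 and id 4.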

Upvotes: 3

Andrew Haynes

Reputation: 2640

You could use a join in PROC SQL, but it is probably simpler to use a merge in a data step with the in= data set option.

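data want;
  /* in= creates temporary flags: 1 when that data set contributes to the current row */
  merge oneColData(in = A) otherData(in = B);
  by id;  /* key variable, called id in the question */

  /* keep only IDs found in both the list and the data */
  if A and B;

run;

You merge the two datasets by the ID variable, and if A and B keeps only the observations whose ID appears in both the single-column dataset and the other dataset. (if A alone would also output list IDs that have no match in otherData, with missing values for the other variables.) For this to work, the ID variable (id here) must exist under the same name in both datasets, and both datasets must be sorted by it.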
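
The PROC SQL join mentioned at the start of this answer could be sketched as follows. It reuses the dataset names from the data step above and assumes the key variable is named id in both tables, which the question does not confirm.

```sas
/* Inner join: keep rows of otherData whose id also appears in oneColData. */
/* Assumption: the key variable is named id in both data sets.             */
proc sql;
  create table want as
  select b.*
  from otherData as b
       inner join oneColData as a
       on a.id = b.id;
quit;
```

Because the list holds unique identifiers, the join should not duplicate rows from otherData, and PROC SQL does not require either table to be sorted beforehand.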

Upvotes: 1
