Helen
Helen

Reputation: 607

Proc SQL: inobs and outobs not random

It seems to me that when I re-run an proc sql code in SAS, and I use the inobs= and the outobs= options, they give me the same rows every time. This leads me to conclude that inobs= and outobs= just give first rows available. Is this true? Is there an option like outobs= that will give me outobs=n random rows?

Upvotes: 1

Views: 5612

Answers (3)

ByronSchuurman
ByronSchuurman

Reputation: 143

If find the previous answers a bit confusing. Below you find a relative easy solution for generating a random sample each time. By multiplying an exisiting numeric column (like e.g. an id, or zipcode etc) you can order the set by this newly created random column.

PROC SQL OUTOBS=5;
CREATE TABLE work.randomshizzle as
SELECT A.*
FROM ( SELECT ZIP * RANUNI(-1) AS RANDEB, t1.*
       FROM sashelp.ZIPCODE t1 
      ) A
ORDER BY A.RANDEB;
QUIT;

Works like a charm.

Upvotes: 1

user667489
user667489

Reputation: 9569

I think the closest thing you can easily do within proc sql itself is something like this. It won't give you a precise number of rows, but it should get you roughly the right number provided that you know in advance how many rows there are in each of the source tables.

/*20% random sample, same sample each run */
proc sql;
  create table example as 
    select * from sashelp.class(where=(ranuni(1) < 0.2));
quit;

/*20% random sample, different sample each run */
proc sql;
  create table example as 
    select * from sashelp.class(where=(rand('UNIFORM') < 0.2));
quit;

Otherwise, as Reeza has suggested, take samples from your datasets via proc surveyselect or some other method and pass those into proc sql.

I'd suggest using where clauses within individual dataset options, so that you can leave the rest of your query unchanged, including any existing where clauses, but there are various ways you could do this.

Upvotes: 2

Shenglin Chen
Shenglin Chen

Reputation: 4554

A dummy number could be induced, then sorted by dummy number, a precise number of rows of data set could be produced.

proc sql outobs=5;
    select ranuni(1234) as num,* from sashelp.class order by num;
quit;

Upvotes: 7

Related Questions