Reputation: 607
It seems to me that when I re-run an proc sql code in SAS, and I use the inobs= and the outobs= options, they give me the same rows every time. This leads me to conclude that inobs= and outobs= just give first rows available. Is this true? Is there an option like outobs= that will give me outobs=n random rows?
Upvotes: 1
Views: 5612
Reputation: 143
If find the previous answers a bit confusing. Below you find a relative easy solution for generating a random sample each time. By multiplying an exisiting numeric column (like e.g. an id, or zipcode etc) you can order the set by this newly created random column.
PROC SQL OUTOBS=5;
CREATE TABLE work.randomshizzle as
SELECT A.*
FROM ( SELECT ZIP * RANUNI(-1) AS RANDEB, t1.*
FROM sashelp.ZIPCODE t1
) A
ORDER BY A.RANDEB;
QUIT;
Works like a charm.
Upvotes: 1
Reputation: 9569
I think the closest thing you can easily do within proc sql
itself is something like this. It won't give you a precise number of rows, but it should get you roughly the right number provided that you know in advance how many rows there are in each of the source tables.
/*20% random sample, same sample each run */
proc sql;
create table example as
select * from sashelp.class(where=(ranuni(1) < 0.2));
quit;
/*20% random sample, different sample each run */
proc sql;
create table example as
select * from sashelp.class(where=(rand('UNIFORM') < 0.2));
quit;
Otherwise, as Reeza has suggested, take samples from your datasets via proc surveyselect
or some other method and pass those into proc sql
.
I'd suggest using where clauses within individual dataset options, so that you can leave the rest of your query unchanged, including any existing where clauses, but there are various ways you could do this.
Upvotes: 2
Reputation: 4554
A dummy number could be induced, then sorted by dummy number, a precise number of rows of data set could be produced.
proc sql outobs=5;
select ranuni(1234) as num,* from sashelp.class order by num;
quit;
Upvotes: 7