Eyal Marom

Reputation: 301

SAS: select 10 random observations from a table

I have a data set with 1000 observations and I want to select 10 of them at random. As I understand it, I need to use RANUNI or RAND, but I can't figure out how to implement this.

Thanks

Upvotes: 1

Views: 842

Answers (2)

Llex

Reputation: 1770

This macro randomly selects a given number of observations from a dataset.

Input (first five rows):

+---------+---+----+
| counter | x | y  |
+---------+---+----+
|       1 | 2 |  2 |
|       2 | 3 |  6 |
|       3 | 4 | 12 |
|       4 | 5 | 20 |
|       5 | 6 | 30 |
+---------+---+----+

data have;
   do counter=1 to 1000;
      x=counter+1;
      y=counter*x;
      output;
   end;
run;

Macro:

%macro select_random_obs(libname,memname,num);%macro d;%mend d;
/*
libname - libname of your dataset
memname - name of dataset
num - num of obs to select randomly
*/
proc sql noprint; /*select num of obs in your dataset (if it is not static value)*/
   select nobs into:max from dictionary.tables where libname="%upcase(&libname)" and memname="%upcase(&memname)";
quit;
%let rand_list=; /*macro variable that will contain the random observation numbers to select*/

data _null_; /*fill rand_list with &num distinct random observation numbers*/
   length tList $32000 str $12;
   n=0;
   do while (n<&num);
      repeat:
      u = rand("Uniform");
      k = ceil( &max*u ); /*random integer between 1 and &max*/
      str=strip(put(k,best12.)); /*put (not input) converts the number to text*/
      do i=1 to countw(tList,' ');
         if scan(tList,i,' ') = str then goto repeat; /*already drawn: draw again*/
      end;
      tList=catx(' ',tList,str);
      n=n+1;
   end;
   call symputx('rand_list',tList);
run;
%put &=rand_list;

data want; /*create a new data set containing the randomly selected observations*/
   set &libname..&memname; /*read the data set passed to the macro, not a hard-coded name*/
   if _N_ in (&rand_list);
run;

%mend select_random_obs;

%select_random_obs(work,have,10);

Output:

+---------+-----+--------+
| counter |  x  |   y    |
+---------+-----+--------+
|      33 |  34 |   1122 |
|     344 | 345 | 118680 |
|     466 | 467 | 217622 |
|     478 | 479 | 228962 |
|     552 | 553 | 305256 |
|     861 | 862 | 742182 |
|     890 | 891 | 792990 |
|     904 | 905 | 818120 |
|     922 | 923 | 851006 |
|     941 | 942 | 886422 |
+---------+-----+--------+

Upvotes: 1

PeterClemmensen

Reputation: 4937

There are many ways to do this, but the simplest is probably PROC SURVEYSELECT with simple random sampling (method=srs):

data have;
    do x=1 to 1000;
        output;
    end;
run;

proc surveyselect data=have out=want seed=123 noprint
     method=srs
     sampsize=10;
run;
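
Since the question mentions RAND, here is a sketch of one of those other ways: give every row a uniform random key, sort on that key, and keep the first 10 rows. The data set names keyed and want_rand, the variable _u, and the seed 123 are illustrative placeholders, not anything from the question.

```sas
/* Sketch only: keyed, want_rand, _u and the seed value are assumed names/values. */
data keyed;
    set have;
    if _n_ = 1 then call streaminit(123); /* fixed seed so the sample is repeatable */
    _u = rand("Uniform");                 /* one random sort key per row */
run;

proc sort data=keyed;
    by _u;                                /* random keys put the rows in random order */
run;

data want_rand;
    set keyed(obs=10 drop=_u);            /* first 10 rows of the shuffled table */
run;
```

This needs a full sort, so on large tables PROC SURVEYSELECT is likely the better choice; the sketch is mainly useful for seeing how RAND fits into the task.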

Upvotes: 2
