jswtraveler
jswtraveler

Reputation: 355

SAS/SQL Choose random row by group

I've seen a fair number of different solutions to this using MYSQL but some of the functions that they use don't work within SAS. I have a very large list of customer records that looks like this:

DATE      ID    .........  
1/31/13   1    ............  
1/31/13   2    ...........  
2/28/13   1    ............  
3/31/13   1   ............  
3/31/13   2    ............  

And I need to select only 1 row per ID group. The thing is I don't want to just choose the first date for each I want to choose a random or arbitrary(whichever is easier to do) row from the available options. Does anybody know a way I could do this?

Upvotes: 3

Views: 2274

Answers (2)

Michael Richardson
Michael Richardson

Reputation: 4282

Barebones solution:

Proc SurveySelect data=Have out=Want noprint
    Method = urs
    N = 1
    outhits
    rep = 1;
    Strata Id ;
run;

Depending on your goal, the answer could change slightly. If you're bootstrapping (which it appears you may be), I found this paper quite useful: Don't Be Loopy: Re-Sampling and Simulation the SAS® Way

Upvotes: 2

Skyler
Skyler

Reputation: 170

I'm not sure how to do it from one big file but I would write a loop to split up the code into id groups and then do a random sampling.

%macro MACRO();
proc sql noprint;
    select ID into :ID separated by " "
    from Original_table
    ;
quit;

proc sql noprint;
    select count (*) into :NumIDs
    from Original_table
    ;
quit;

%let NumID = &NumIDs;

%do i = 1 %to &NumID %by 1;
    %let loopID = %scan(&ID, &i, " ");

proc sql;
    create table ID_&loopID. as
    select *
    from Original_table
    where ID = &loopID.
    ;
quit;

    proc surveyselect data = ID&loopID. method = SRS rep = 1
    sampsize = 1 out = RandomID_&loopID.;
    id _all_;
run;

    proc append base = Id_Random_all
    data = RandomID_&loopID.
    ;
quit;


%end;
%mend;
%MACRO();

The proc append will then add all of them together into one file to look at. I'm sure there is an easier way to do this but this is what I thought of.

Upvotes: 0

Related Questions