Sahil Jain
Sahil Jain

Reputation: 59

Pick random observation for each by group in SAS

I have two datasets: form and pool having similar table structure.

a) Dataset form enter image description here

b) Dataset pool pool dataset

I want to populate the null values under Sub-Level column in form dataset with Sub-Level values in pool dataset where a Sub-Level in pool dataset shoud belong to same Key and Level of form dataset.

How can be this done in SAS ?

EDIT 1 : null values that are populated in form dataset must be distinct or unique within each by group of key, level.

Upvotes: 1

Views: 834

Answers (1)

Jetzler
Jetzler

Reputation: 797

  1. Create a new row ID in form: key, level, new_id.
  2. Random sort pool on group level and create the same id.
  3. Left join by key, level, new_id and replace missing values.

As there are less or equal values per group in form compared to pool and all values are disjunct and unique you will complete form without generating duplicates.

Code example:

data form;
  set form;
  retain new_id;
  if first.level then new_id = 0;
  new_id + 1;
run;
data pool;
  set pool;
  ran_num = ranuni(12345); /* generate random uniform number */
run;
proc sort data=pool; by key level ran_num; run; /* sort by random number */
data pool;
  set pool;
  retain new_id;
  if first.level then new_id = 0;
  new_id + 1;
run;
proc sql;
create table form_full as
  select a.key, a.level, coalescec(a.sub_level,b.sub_level) as sub_level
    from form a
      left join pool b
        on a.key eq b.key and a.level eq b.level and a.new_id eq b.new_id
      ;
quit;

edit: In case pool has fewer rows in a group than form, change new_id in form:

data form;
  set form;
  retain new_id;
  if first.level then new_id = 0;
  if sub_level ne . then new_id + 1;
run;

Upvotes: 1

Related Questions