Maria

Reputation: 1

How to remove duplicates in SAS data

I am trying to delete the observations in my data set that are the same across multiple variables.

For example

PIN  Start Date     End Date
1    Jan 1 2014     Jan 3 2014
1    Jan 1 2014     Jan 3 2015
3    March 2 2014   March 5 2014
4    July 1 2014    July 8 2014
5    July 1 2014    July 8 2014
6    August 9 2014  August 24 2014

I want to remove the observations that share the same PIN and Start Date.

Upvotes: 0

Views: 1047

Answers (1)

Stu Sztukowski

Reputation: 12849

Translate the string dates into SAS dates first.

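data have2;
    /* Rename the character date columns so the numeric versions can reuse the names */
    set have(rename=(start_date = _start_date 
                     end_date   = _end_date) );

    /* Width 20 so longer strings such as "August 24 2014" are not truncated */
    start_date = input(strip(_start_date), anydtdte20.);
    end_date   = input(strip(_end_date), anydtdte20.);
   
    format start_date end_date date9.;

    drop _start_date _end_date;
run;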

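Then use proc sort with the nodupkey option, which keeps the first observation for each pin/start_date combination and drops the rest.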

proc sort data=have2 nodupkey;
    by pin start_date;
run;
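
If you want to check which rows were dropped, a minimal sketch along these lines may help. It assumes the have2 data set from above; the output names want and dups are placeholders I made up, not anything from your data. The dupout= option writes the removed observations to a separate data set, and out= leaves have2 itself unchanged.

```sas
/* Sketch only: want and dups are hypothetical data set names */
proc sort data=have2 out=want nodupkey dupout=dups;
    by pin start_date;   /* rows matching on both keys count as duplicates */
run;

/* Inspect the observations that nodupkey removed */
proc print data=dups;
run;
```

With the sample data in the question, I would expect the second PIN 1 row to be the one that ends up in dups, since it repeats the PIN and start date of the row before it.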

Upvotes: 1
