Lamyanba
Lamyanba

Reputation: 31

create new data set by grouping SAS

NAME  DATE
----  ---------- 
BOB   24/05/2013
BOB   12/06/2012
BOB   19/10/2011
BOB   05/02/2010
BOB   05/01/2009
CARL  15/05/2011
LOUI  15/01/2014
LOUI  15/05/2013
LOUI  15/05/2012

DATA newdata;
  SET mydata;
  count + 1;
  IF FIRST.name THEN count=1;
  BY name DESCENDING date;
run;

here i got count group wise 1,2,3 so on..I want the output of name(all obs of bob) if count> 3. please help me..

Upvotes: 0

Views: 93

Answers (3)

Jay Corbett
Jay Corbett

Reputation: 28391

Here are a couple different ways to do this using SUBQUERIES in PROC SQL

Data HAVE;
    Length NAME $50;
    Input Name $ Date: ddmmyy10.;
    Format date ddmmyy10.;
    datalines;
BOB  24/05/2013
BOB  12/06/2012
BOB  19/10/2011
BOB  05/02/2010
BOB  05/01/2009
CARL 15/05/2011
LOUI 15/01/2014
LOUI 15/05/2013
LOUI 15/05/2012
;
Run;

Using a multiple-value subquery in the Where statement

    Proc sql;
        Create table WANT1 as
        Select *
        From Have
        Where Name in (Select name from have b group by b.name having count(b.name)>3);
    Quit;

Using a subquery in the From clause

    Proc sql;
        Create table WANT2 as
        Select a.name, a.date
        From Have a Inner Join (select name, count(name) as Count from have b group by b.name having Count>3)
            On a.name=b.name
        ;
    Quit;

Upvotes: 0

Reeza
Reeza

Reputation: 21264

I think this shows the power of SQL - though some would say since this generates a NOTE in the log it isn't good practice. Use the GROUP & HAVING clause in SQL to create a count of the names that you then limit to 3.

proc sql;
create table want as
select *
from have
group by name
having count(name)>3;
quit;

Upvotes: 0

Joe
Joe

Reputation: 63424

The simplest way to do that is to output the last row for each ID if it is > 3, then merge that dataset back to your master dataset, keeping only matches. You could also use PROC FREQ to generate the dataset of counts and merge to that.

You can do it in a single datastep using a DoW loop, but that's more complicated, so I wouldn't recommend a new user do that.

Upvotes: 1

Related Questions