Data Guy
Data Guy

Reputation: 87

Grouping Logic with if conditions SAS

I have a data set structured like this:

CLIENT_ID    Segment      Yearmonth       
XXXX         A            201305         
XXXX         A            201306         
XXXX         B            201307         
XXXX         B            201308         
XXXX         C            201309         
XXXX         A            201310        
YYYY         D            201306
YYYY         D            201307
YYYY         C            201308
YYYY         D            201309
ZZZZ         E            201309
ZZZZ         E            201309
ZZZZ         E            201309

This is the output i am trying to get

CLIENT_ID    Segment       New_sequencing_Variable    
XXXX         A                       1
XXXX         B                       2
XXXX         C                       3
XXXX         A                       4
YYYY         D                       1
YYYY         C                       2
YYYY         D                       3
ZZZZ         E                       1

The code i have so far is:

HISTORICAL_SEGMENTS2 HISTORICAL_SEGMENTS3 ;
  set GENERAL.HISTORICAL_SEGMENTS ;
  by Client_ID ;
  if first.Client_ID then count = 0 ;
  count + 1 ;
  output HISTORICAL_SEGMENTS2 ; *output every record;
  if first.Client_ID then output HISTORICAL_SEGMENTS3 ;  *output first of each group;
run;

i need something like :

 if previous.segment <> current.segment then output HISTORICAL_SEGMENTS3 

However, am able to get the first clientID, The segment and the sequencing variable but only the first client ID and the segment next to it in the output table HISTORICAL_SEGMENTS3 but only the first record.

How do i include the logic that output while grouping by client_id and yearmonth basically output a row row everytime a segment is different than the previous Yearmonths segment

Upvotes: 0

Views: 75

Answers (2)

G.Arima
G.Arima

Reputation: 1171

This can be simply achieved by the command "notsorted"

data abc;
input client_id $5. segment $2. yearmonth  ;
cards; 
xxxx a 201305
xxxx a 201306
xxxx b 201307
xxxx b 201308
xxxx c 201309
xxxx a 201310
yyyy d 201306
yyyy d 201307
yyyy c 201308
yyyy d 201309
zzzz e 201309
zzzz e 201309
zzzz e 201309
;
run;


data abc1;
set abc;
by client_id segment notsorted;
if first.segment;
if first.client_id then new_sequencing_variable = 0;
new_sequencing_variable + 1;
run;  

My Output:

client_id   segment       new_sequencing_variable    
XXXX         A                       1
XXXX         B                       2
XXXX         C                       3
XXXX         A                       4
YYYY         D                       1
YYYY         C                       2
YYYY         D                       3
ZZZZ         E                       1  

Let me know in case of any clarifications.

Upvotes: 2

Seba
Seba

Reputation: 146

You are basically there already. Just add another count variable for the HISTORICAL_SEGMENTS3 dataset and a lag variable for the previous observations of segment.

data HISTORICAL_SEGMENTS2 HISTORICAL_SEGMENTS3;
    set HISTORICAL_SEGMENTS;
    by Client_ID;
    if first.Client_ID then do;
        count = 0;
        count2 = 0;
    end;
    count + 1;

    prev_segment = lag(segment);
    if prev_segment ne segment then do;
        count2 + 1;
        output HISTORICAL_SEGMENTS3;
    end;
    output HISTORICAL_SEGMENTS2;
    drop prev_segment;
run;

Upvotes: 1

Related Questions