kacperdominik
kacperdominik

Reputation: 194

Getting n-th row in a group - SAS / SQL

any hints how do I get the n-th element from the group. SAS 4GL is preferred, SQL will do.

From the list of transactions below, I would like to extract a row corresponding to a 2nd (by date) transaction per each customer.

Input:

input table

Output:

output

Customer C from the example above has just one transaction, hence there is no rows corresponding to him in the output table.

Thank you in advance!

Upvotes: 2

Views: 3815

Answers (4)

momo1644
momo1644

Reputation: 1804

This code will handle the case of multiple items bought on that day. I used Retain Functionality and kept a count of sub-groups.

Data: I added two extra item to A & B on the same day

data have;
 infile datalines dlm=',' dsd;
 informat Transaction_Date yymmdd10.;
 format Transaction_Date yymmdd10.;
 input Transaction_Date Customer $ Product $ Quantity Purchase_Value;
 datalines;
2018-01-15, A , Milk , 1 , 100
2018-01-28, A , Onion , 2 , 140
2018-01-28, A , corn , 2 , 140
2018-02-13, B , Carrot , 1 , 50
2018-03-20, B , Rice , 10 , 40
2018-03-20, B , tomato , 10 , 40
2018-04-14, B , Carrot , 1 , 50
2018-06-02, C , Candy  ,5 , 125
;
run;

Code: The Counter is your Nth row

proc sort data=have; by Customer Transaction_Date; run;
data want;
set have;
by Customer  Transaction_Date ;
retain outter;
retain inner;
retain counter;
if first.Customer then do; outter=1; counter=0; end; else outter+1;
if first.Transaction_Date then do; inner=1; counter+1; end; else inner+1;
if counter=2 then output; 
run;

Output:

 Transaction_Date=2018-01-28 Customer=A Product=Onion Quantity=2 Purchase_Value=140 outter=2 inner=1 counter=2
 Transaction_Date=2018-01-28 Customer=A Product=corn Quantity=2 Purchase_Value=140 outter=3 inner=2 counter=2 
 Transaction_Date=2018-03-20 Customer=B Product=Rice Quantity=10 Purchase_Value=40 outter=2 inner=1 counter=2
 Transaction_Date=2018-03-20 Customer=B Product=tomato Quantity=10 Purchase_Value=40 outter=3 inner=2 counter=2

Upvotes: 1

Kiran
Kiran

Reputation: 3315

  1. if you are trying to do in database processing then @alfonsohdez08 is the way to go . Otherwise @J_Lard is the way to go. One another way by using proc sql is shown below

    proc sql;
    create table want(drop = rnk) as 
    select a.*,
    (select count(transaction_date) from have b
    where a.customer=b.customer
    and a.transaction_date>=b.transaction_date) as rnk
      from have a 
    where calculated rnk = 2;
    

Upvotes: 0

J_Lard
J_Lard

Reputation: 1103

There are a couple of ways you can accomplish this using SAS. The easiest being to sort the data set and then use a data step to output the records you want.

proc sort data=have;
by Customer TransactionDate;
run;

data want(drop=counter);
set have;
by Customer TransactionDate;
retain counter;
if first.Customer then counter=1;
else counter+1;
if counter=2 then output;
run;

Upvotes: 4

alfonsohdez08
alfonsohdez08

Reputation: 88

The way to achieve the desired result set is by applying windowing functions. Please, check the query below.

SELECT
    c.TransactionDate
    , c.Customer
    , c.Product
    , c.Quantity
    , c.[Purchase Value]
FROM (
    SELECT
        TransactionDate
        , Customer
        , Product
        , Quantity
        , [Purchase Value]
        , RANK() OVER (PARTITION BY Customer ORDER BY TransactionDate) AS Ranking
    FROM transactions
) AS c
WHERE c.Ranking = 2;

I hope this help you!

EDIT

This query works in case you're working with an engine that supports windows functions how I'm using in the query above (SQL Server in this case).

Upvotes: 1

Related Questions