Reputation: 609

SQL: get top N for latest month and history for these

I want to find the top N names (those with highest value) for the latest period. And for these names I want to get the historic values.

Per this question Select top 10 records for each category

I have tried the following:

base_table (period, name, value)

CREATE VIEW TOP3 AS
SELECT DISTINCT a.*, COUNT(*) as rank
FROM base_table AS a
LEFT JOIN base_table AS a2 
ON a.period = a2.period and a.value <= a2.value
GROUP BY a.value
HAVING COUNT(*) <= 3
ORDER BY period, rank;

And this gets me the top 3 for each period.

period | name | value | rank
092020   Joe    100      1
092020   Jane   90       2
092020   Doe    80       3

082020   Jan    100      1
082020   Doc    99       2
082020   Pete   98       3
.
.
.

But not the desired result.

Let's say these are the top 3 for the latest month:

period | name | value | rank
092020   Joe    100      1
092020   Jane   90       2
092020   Doe    80       3

Now I want to get the history for these, even though they are outside top 3 previous month, like so:

period | name | value | rank
092020   Joe    100      1
092020   Jane   90       2
092020   Doe    80       3

082020   Joe    95       4
082020   Jane   94       5
082020   Doe    7        50
.        .      .        .
.        .      .        .
.        .      .        .
092019  Joe     42       20
092019  Doe     34       35
092019  Jane    1        100

FYI I am working in SAS, using PROC SQL, so niceties like CTE's, window functions and other stuff are not available to me.

Upvotes: 0

Answers (3)

Tom

Reputation: 51611

Not sure why you would use SQL for this. Here is a method using OUTOBS option of PROC SQL.

First let's make some actual test data:

data have ;
  input period :ddmmyy. name $ value rank ;
  format period yymm7. ;
cards;
01092020   Joe    100      1
01092020   Jane   90       2
01092020   Doe    80       3
01082020   Joe    95       4
01082020   Jane   94       5
01082020   Doe    7        50
01092019  Joe     42       20
01092019  Doe     34       35
01092019  Jane    1        100
;

Now let's set N into a macro variable to make it a little more flexible. Since the example data only has 3 names let's use N of 2.

%let n=2;

So first find the top N for the most recent (MAX) period.

proc sql;
reset outobs=&n;
create table top&n as select name
  from have 
  having period=max(period)
  order by value desc
;

Now reset OUTOBS to MAX and use that TOPn list to pull all of the data from the original dataset for those names.

reset outobs=max;
create table want as 
select * from have
where name in (select name from top&n)
;
quit;

Let's look at the results:

proc print data=want;
run;

Which is:

Obs     period    name    value    rank

 1     2020M09    Joe      100        1
 2     2020M09    Jane      90        2
 3     2020M08    Joe       95        4
 4     2020M08    Jane      94        5
 5     2019M09    Joe       42       20
 6     2019M09    Jane       1      100

Upvotes: 0

Gordon Linoff

Reputation: 1270411

You can use JOIN:

SELECT a.*, ar.rank
FROM base_table a JOIN
     (SELECT a.value, COUNT(*) as rank
      FROM base_table a JOIN
           base_table a2 
           ON a.period = a2.period and a.value <= a2.value JOIN
           (SELECT max(a3.period) as max_period
            FROM base_table a3
           ) a3
           ON a3.max_period = period
      GROUP BY a.value
      HAVING COUNT(*) <= 3
     ) ar
     ON ar.value = a.value
ORDER BY period, rank;

I suspect this would be simpler using a data step, at least for assigning the rank.

Upvotes: 1

luisvenezian

Reputation: 501

Can you try it with subqueries? if not you can create a view instead ordered with some row_number() / rank function to JOIN

SELECT b.period, b.name, b.value
FROM base_table b
INNER JOIN
(  
    -- subquery to get last period 
    SELECT TOP 3 name, period, value
    FROM base_table
    WHERE period = (SELECT MAX(period) FROM base_table)
    ORDER BY 3 DESC
) AS s ON s.name = b.name -- join them

Upvotes: 1

SQL: get top N for latest month and history for these

Answers (3)

Related Questions