galih
galih

Reputation: 511

hive rank over grouped value

gurus, i stumbled in hive rank process, i woluld like to rank transaction in each day (with no repeating rank value for the same trx value)

date      hour trx  rnk
18/03/2018  0   1   24
18/03/2018  1   2   23
18/03/2018  2   3   22
18/03/2018  3   4   21
18/03/2018  4   5   20
18/03/2018  5   6   19
18/03/2018  6   7   18
18/03/2018  7   8   17
18/03/2018  8   9   16
18/03/2018  9   10  15
18/03/2018  10  11  14
18/03/2018  11  12  13
18/03/2018  12  13  12
18/03/2018  13  14  11
18/03/2018  14  15  10
18/03/2018  15  16  9
18/03/2018  16  17  8
18/03/2018  17  18  7
18/03/2018  18  19  6
18/03/2018  19  20  5
18/03/2018  20  21  4
18/03/2018  21  22  3
18/03/2018  22  23  2
18/03/2018  23  24  1
17/03/2018  0   1   24
17/03/2018  1   2   23
17/03/2018  2   3   22
17/03/2018  3   4   21
17/03/2018  4   5   20
17/03/2018  5   6   19
17/03/2018  6   7   18
17/03/2018  7   8   17
17/03/2018  8   9   16
17/03/2018  9   10  15
17/03/2018  10  11  14
17/03/2018  11  12  13
17/03/2018  12  13  12
17/03/2018  13  14  11
17/03/2018  14  15  10
17/03/2018  15  16  9
17/03/2018  16  17  8
17/03/2018  17  18  7
17/03/2018  18  19  6
17/03/2018  19  20  5
17/03/2018  20  21  4
17/03/2018  21  22  3
17/03/2018  22  23  2
17/03/2018  23  24  1

here is my code

select a.date, a.hour, trx, rank() over (order by a.trx) as rnk from(
select date,hour, count(*) as trx from  smy_tb
group by date, hour
)a
limit 100;

the problem is: 1. rank value repeated with the same trx value 2. rank value continued to next date (it should be grouped for date and hour, so each date will only return 24 rank value)

need advice, thank you

Upvotes: 0

Views: 195

Answers (2)

galih
galih

Reputation: 511

as explained by @BKS

this is the resolved code

select a.date, a.hour, trx, row_number() over (partition by a.date order by a.trx desc) as rnk from(
select date,hour, count(*) as trx from  smy_tb
group by date, hour
)a
limit 100;

Upvotes: 0

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49270

You should partition by date column and use a specific ordering.

rank() over (partition by a.date order by a.hour desc)

Upvotes: 1

Related Questions