Reputation: 2838
I have a table (t_stocks) with data like this:
exchanged,stock_symbol,closing_date,closing_price
NSE,TCS,2009-08-09,2200.1
NSE,TCS,2009-08-10,2300.1
NSE,TCS,2009-08-11,12200.1
NSE,TCS,2009-08-12,22300.1
NSE,TCS,2009-09-09,2200.1
NSE,TCS,2009-09-10,2300.1
NSE,TCS,2009-09-11,12200.1
NSE,TCS,2009-09-12,22300.1
NSE,INFY,2009-08-09,2500.34
NSE,INFY,2009-08-10,1500.34
NSE,INFY,2009-08-09,7500.34
NSE,INFY,2009-08-10,14500.34
NSE,INFY,2009-09-09,2500.34
NSE,INFY,2009-09-10,1500.34
NSE,INFY,2009-09-09,7500.34
NSE,INFY,2009-09-10,14500.34
NSE,TCS,2010-08-09,2200.1
NSE,TCS,2010-08-10,2300.1
NSE,TCS,2010-08-11,12200.1
NSE,TCS,2010-08-12,22300.1
NSE,TCS,2010-09-09,2200.1
NSE,TCS,2010-09-10,2300.1
NSE,TCS,2010-09-11,12200.1
NSE,TCS,2010-09-12,22300.1
NSE,INFY,2010-08-09,2500.34
NSE,INFY,2010-08-10,1500.34
NSE,INFY,2010-08-09,7500.34
NSE,INFY,2010-08-10,14500.34
NSE,INFY,2010-09-09,2500.34
NSE,INFY,2010-09-10,1500.34
NSE,INFY,2010-09-09,7500.34
NSE,INFY,2010-09-10,14500.34
...
...
I would need to write a query which generate a report as follow. exchanged, stock_symbol , closing_date , closing_price , yesterday_closing , diff_yesterday_price (Price difference between Yesterday prices and today prices) with an output like this:
+----------------+-------------------+-------------------+--------------------+------------------------+-----------------------+--+
| exchanged | stock_symbol | closing_date | closing_price | yesterday_closing | diff_yesterday_price |
+----------------+-------------------+-------------------+--------------------+------------------------+-----------------------+--+
| NSE | INFY | 2009-08-09 | 2500.34 | NULL | NULL |
| NSE | INFY | 2009-08-09 | 7500.34 | 2500.34 | -5000 |
| NSE | INFY | 2009-08-10 | 14500.34 | 7500.34 | -7000 |
| NSE | INFY | 2009-08-10 | 1500.34 | 14500.34 | 13000 |
| NSE | INFY | 2009-09-09 | 7500.34 | 1500.34 | -6000 |
| NSE | INFY | 2009-09-09 | 2500.34 | 7500.34 | 5000 |
| NSE | INFY | 2009-09-10 | 14500.34 | 1500.34 | -13000 |
| NSE | INFY | 2009-09-10 | 1500.34 | 2500.34 | 1000 |
| NSE | INFY | 2010-08-09 | 7500.34 | 14500.34 | 7000 |
| NSE | INFY | 2010-08-09 | 2500.34 | 7500.34 | 5000 |
.....
.....
May anyone give me some clues to do this in an efficient way.
Thanks in advance,
Regards.
Upvotes: 0
Views: 33
Reputation: 7503
You can use hive window function lag()
to solve this. You can read more about window functions in hive here.
Here is the working DEMO in PostgreSQL
, but same query works in HIVE
as well.
select
exchanged,
stock_symbol,
closing_date,
closing_price,
yesterday_price,
(yesterday_price - closing_price) as diff_yesterday_price
from
(
select
*,
lag(closing_price) over (partition by stock_symbol order by closing_date) as yesterday_price
from stockExchange
) la
order by
stock_symbol,
closing_date
Upvotes: 2