Reputation: 781
I have 2 tables: Table 1:
emp_id emp_log_id timestamp
1 234 04 Oct 2019 23:10
1 05 Oct 2019 23:10
2 335 04 Oct 2019 23:10
2 03 Oct 2019 23:10
3 04 Oct 2019 23:10
4 324 04 Oct 2019 23:10
My expected output is:
emp_id emp_log_id timestamp
1 234 04 Oct 2019 23:10
2 335 04 Oct 2019 23:10
3 04 Oct 2019 23:10
4 324 04 Oct 2019 23:10
How to write hive query for this.
Another table has data like: Table 2:
emp_id emp_log_id
1 234
1 05
2 335
2 03
3 04
4 324
How to implement the same requirement in this table 2.
Please help.
Upvotes: 0
Views: 101
Reputation: 1269633
I would recommend row_number()
:
select t.*
from (select t.*,
row_number() over (partition by emp_id
order by (emp_log_id is not null) desc, timestamp desc
) as seqnum
from t
) t
where seqnum = 1
Upvotes: 1
Reputation: 13006
using aggregation function max()
will do it.
select emp_id, max(emp_log_id) as emp_log_id, timestamp
from table1
group by emp_id, timestamp
Upvotes: 1