user3447653
user3447653

Reputation: 4158

Lead function in Hive

I am trying to retrieve the next "name1" after a specific "name2". For example, next "name1" after "complete2" in "name2" is "init3". This is the query I use, but it returns null.

    date_time                    name1        name2           
    2017-11-23 00:09:46          init1        complete1
    2017-11-23 00:09:48          init2        complete2
    2017-11-23 00:09:50          init3        complete3


   select date_time,name1,name2,
   lead(name1,1) over (order by date_time) as next_word
   from tab1
   where date_time between "2017-11-23 00:00:00" and "2017-11-23 23:59:59" 
   and name2 like "%complete2%";

Upvotes: 0

Views: 12721

Answers (1)

Ramesh
Ramesh

Reputation: 1484

Here is the solution

SELECT * 
  FROM (SELECT date_time,name1,name2,
               LEAD(name1,1) over (ORDER BY date_time) AS next_word
          FROM tab1
         WHERE date_time between "2017-11-23 00:00:00" and "2017-11-23 23:59:59" ) v
 WHERE name2 LIKE "%complete2%";

Output

2017-11-23 00:09:48 init2   complete2   init3

Hive evaluates the WHERE conditions before applying the LEAD. In your query that would have limited the LEAD to be applied in one record for which there was no LEAD.

You can use a subquery like I have done above or use a CTE/ WITH Clause.

Hope this helps!

Upvotes: 5

Related Questions