user2978621
user2978621

Reputation: 813

First value (Oracle) equivalent in Hive

I want to create hive query for the following.

insert into  tempTableName  
select distinct col_a
        ,  first_value(col_b)  
            over (partition by col_a 
            order by nvl(col_c,0) desc, length(col_b) asc, col_b asc) 
from tableA

As hive does not support first value.I want to know what could be the equivalent in simple query for first_value function. Any suggestions ??

Upvotes: 2

Views: 4292

Answers (2)

Joe K
Joe K

Reputation: 18424

I am not exactly familiar with the oracle semantics here, but isn't this just a group by and arg-min? Structs in hive compare in the order of their fields, so you can do something like this:

select col_a,
min(
  named_struct(
    'col_c', -coalesce(col_c, 0),
    'len' , length(col_b),
    'col_b', col_b
  )
).col_b
from tableA
group by col_a

Upvotes: 2

visakh
visakh

Reputation: 2553

HIVE 0.11 does support FIRST_VALUE.

But as per HIVE JIRA, there's an open issue that you cannot have more than one ORDER BY column in first_value. You haven't reported what error you are getting, but if it's FAILED: SemanticException Range based Window Frame can have only 1 Sort Key, then you have to modify the ORDER BY columns.

Edit: If you are not on HIVE 0.11, then I would suggest installing a UDF for FIRST_VALUE. I guess that would be the straightforward way to do this. You might want to take a look at these UDFS.

Upvotes: 1

Related Questions