Error while compiling statement: FAILED: SemanticException [Error 10002]

Question

select d.order_type from migu_td_aaa_order_log_d d where  exists(select 1 
from migu_user r where r.user_id = '156210106' and r.user_num = 
d.serv_number) and d.product_id in ('2028594290','2028596512','2028597138' ) 
order by d.opr_time desc limit 1

Why does the above SQL fail? It reports: FAILED: SemanticException [Error 10002]: Line 4:11 Invalid column reference 'opr_time'

But the one below works:

select temp.order_type from (
select d.* from migu_td_aaa_order_log_d d where  exists(select 1 from 
migu_user r where r.user_id = '156210106' and r.user_num = d.serv_number) 
and d.product_id in ('2028594290','2028596512','2028597138' ) order by 
d.opr_time desc limit 1) temp;

This one works fine too, and is much more efficient than the second one:

select d.* from migu_td_aaa_order_log_d d where  exists(select 1 from 
migu_user r where r.user_id = '156210106' and r.user_num = d.serv_number) 
and d.product_id in ('2028594290','2028596512','2028597138' ) 
order by d.opr_time desc limit 1

I only need the order_type field, so although the second one works, it takes much more time. Can anyone help me? Thanks a lot!

David דודו Markovitz · Accepted Answer

1.

Hive currently has an ORDER BY limitation: you cannot order by a column that is not in the select list.
The current status of this issue is PATCH AVAILABLE.

see -
"Can't order by an unselected column"
https://issues.apache.org/jira/browse/HIVE-15160
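Until that patch lands, one way around the limitation is to select the ordering column as well and project only order_type in an outer query. This is an untested sketch based on the queries in the question. It assumes the only problem is that opr_time is missing from the select list, and the alias temp is reused from the question's second query:

```sql
-- Inner query selects only the two columns it needs (not d.*),
-- so opr_time is available to ORDER BY.
select temp.order_type
from (
    select d.order_type, d.opr_time
    from migu_td_aaa_order_log_d d
    where exists (select 1
                  from migu_user r
                  where r.user_id = '156210106'
                    and r.user_num = d.serv_number)
      and d.product_id in ('2028594290','2028596512','2028597138')
    order by d.opr_time desc
    limit 1
) temp;
```

Compared with the question's second query, this carries two columns through the sort instead of every column of the table.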

2.

You might want to get familiar with LEFT SEMI JOIN, which is a cleaner syntax for EXISTS: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins#LanguageManualJoins-JoinSyntax

3.

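min / max over a struct / named_struct can be used instead of order by ... asc / desc and limit 1. Structs are compared field by field in order, so putting opr_time first makes max return the struct of the row with the latest opr_time.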

Here is an alternative solution:

select      max(named_struct('opr_time',d.opr_time,'order_type',d.order_type)).order_type
from        migu_td_aaa_order_log_d d
            left semi join migu_user r
            on      r.user_num = d.serv_number
                and r.user_id  = '156210106'
where       d.product_id in ('2028594290','2028596512','2028597138')
;
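If more than one field of the latest row is needed, the same idea can be extended by keeping the whole struct in a subquery and reading its fields outside. This is an untested sketch; the aliases t and latest are illustrative names, not part of the original answer:

```sql
-- Keep the max struct as a single column, then extract its fields.
select t.latest.order_type,
       t.latest.opr_time
from (
    select max(named_struct('opr_time',   d.opr_time,
                            'order_type', d.order_type)) as latest
    from migu_td_aaa_order_log_d d
    left semi join migu_user r
        on  r.user_num = d.serv_number
        and r.user_id  = '156210106'
    where d.product_id in ('2028594290','2028596512','2028597138')
) t;
```

Using min instead of max would return the row with the earliest opr_time, the counterpart of order by ... asc limit 1.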

P.S.

You should seriously consider treating IDs (user_id, product_id) as numeric rather than as strings.
