Champ
Champ

Reputation: 17

How to use a CTE to update a table - Hive

I'm new to hive and trying to update a table (table_A) with a new column based on the values retrieved from another table (table_B). table_B is used as a lookup table for the categorical variable descriptions.

Ex: table_A

index field_x field_y
1 L S
2 H H
3 M S

Ex: table_B

variable_name variable_category variable_category_description
field_x L Low
field_x M Medium
field_x H High
field_y S Soft
field_y H Hard

Based on the table_A and table_B I need to obtain following table.

index field_x field_x_description field_y field_y_description
1 L Low S Soft
2 H High H Hard
3 M Medium S Soft

I tried following in hue hive editor just to add one column at the beginning. My editor does not support update statements.

CREATE TABLE table_C AS 
SELECT index,
    field_x, 
    field_y, 
(SELECT variable_category_description
    FROM table_B 
    where table_B.variable_name = 'field_x' and  table_B.variable_category= table_A.field_x
AS field_x_description
FROM table_A

I get following error

Error while compiling statement: FAILED: ParseException line 5:5 cannot recognize input near 'SELECT' 'variable_category_description' 'FROM' in expression specification

A fix for this is much appreciated.

Upvotes: 1

Views: 200

Answers (1)

leftjoin
leftjoin

Reputation: 38325

You need two joins (INNER or LEFT) with table_B

CREATE TABLE table_C AS 
SELECT a.index,
       a.field_x, 
       x.variable_category_description AS field_x_description,
       a.field_y, 
       y.variable_category_description AS field_y_description
FROM table_A
     LEFT JOIN table_B x ON x.variable_name  = 'field_x' 
                        and x.variable_category = a.field_x

     LEFT JOIN table_B y ON y.variable_name  = 'field_y' 
                        and y.variable_category = a.field_y
;

Upvotes: 2

Related Questions