Ignacio Alorre
Ignacio Alorre

Reputation: 7605

Hive - When adding a new column to a table I get: SemanticException [Error 10002]: Invalid column reference

I am trying to overwrite a table by merging other 3 tables. In the following way:

  1. I will take all columns from the table1
  2. I will take just two columns from table2
  3. I will add a new columns based on the values of a column in table 3

The three tables: table1, table2 and table3 have as key id

This is the query I am trying (So far I am trying it without the INSERT OVERWRITE TABLE final_table, just with the SELECT to see the results that would be inserted in the table)

--INSERT OVERWRITE TABLE final_table 
select 
        t1.*,
        t2.field1,
        t2.field2,
        CASE WHEN ( t3.indicator = 'F' OR 
                    t3.indicator = 'O' 
                    ) THEN 'Y' ELSE 'N' END AS new_field3
from 
table1 t1 
LEFT OUTER JOIN table2 t2
ON (t1.id = t2.id)
LEFT OUTER JOIN table3 t3
ON (t1.id = t3.id);

But in the CASE WHEN I am getting the following error:

Error while compiling statement: FAILED: SemanticException [Error 10002]: Line 7:17 Invalid column reference 'new_field3'

What am I doing wrong?

Upvotes: 0

Views: 2164

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269773

I don't see a semantic error, but try writing the query as:

select t1.*, t2.field1, t2.field2,
       (case when t3.indicator in ('F', 'O') then 'Y' else 'N'
        end) as new_field3
from table1 t1 left outer join
     table2 t2
     on t1.id = t2.id left outer join
     table3 t3
     on t1.id = t2.id;  -- ???

Most of the changes are cosmetic. IN is recommended instead of lists of OR expressions -- if only because it is easier to write and to read.

More importantly, the join conditions on table3 do not include table3.

Neither of these -- to the best of my knowledge -- would generate a semantic exception. However, the second means the query is not doing what you think it is doing.

Upvotes: 1

Related Questions