Brisi
Brisi

Reputation: 1831

Unable to add a column as part of hive query

CREATE EXTERNAL TABLE old_events
(day STRING, foo STRING, count STRING, internal_id STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '${INPUT}';;

CREATE EXTERNAL TABLE events
(internal_id, foo STRING, count STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '${OUTPUT}';;

INSERT OVERWRITE TABLE events
SELECT internal_id, e2.foo, count(e1.foo)
FROM old_events e2
LEFT OUTER JOIN old_events e1
ON e1.foo = e2.foo
WHERE e1.event = 'event1'
AND e2.event = 'event2'
GROUP BY e2.foo;

I get the following error, if internal_id is part of the select clause FAILED: Error in semantic analysis: line 5:8 Expression Not In Group By Key internal_id

In order to avoid this, I added internal_id to the GROUP BY column and it results in the following error. FAILED: Error in semantic analysis: Column internal_id Found in more than One Tables/Subqueries

How do I get past this issue?

Upvotes: 1

Views: 795

Answers (1)

teledi
teledi

Reputation: 61

Brisi e1, e2 both has the column internal_id so you have to point out which column you want eg:

INSERT OVERWRITE TABLE events
SELECT e2.internal_id, e2.foo, count(e1.foo)
FROM old_events e2
LEFT OUTER JOIN old_events e1
ON e1.foo = e2.foo
AND e1.event = 'event1'
WHERE e2.event = 'event2'
GROUP BY e2.internal_id, e2.foo;

Upvotes: 3

Related Questions