Reputation: 3
I have tried the following query within HUE's Beeswax Query Editor:
SET MAXDATE=(SELECT MAX(DATA_DAY) FROM DB1.DESTINATION_TABLE);
SELECT COUNT(*) FROM DB2.SOURCE_TABLE
WHERE YEAR(DATA_DAY) >= '2015'
AND DATA_DAY > ${HIVECONF:MAXDATE};
This query will not run and produces the following error message:
FAILED: ParseException line 1:4 missing KW_ROLE at 'MAXDATE' near 'MAXDATE' line 1:11 missing EOF at '=' near 'MAXDATE'
Any advice on what the problem is? I don't understand what the KW_ROLE message means.
I come from a SQL Server background and would just run the following within SQL Server, but am trying to find a functional Hadoop/Hive equivalent.
SELECT COUNT(*) FROM DB2.SOURCE_TABLE
WHERE YEAR(DATA_DAY) >= '2015'
AND DATA_DAY > (SELECT MAX(DATA_DAY) FROM DB1.DESTINATION_TABLE)
Upvotes: 0
Views: 1976
Reputation: 5533
As far as I know, hive
support the following syntax too.
SELECT COUNT(*) FROM DB2.SOURCE_TABLE a
JOIN
(SELECT MAX(DATA_DAY) AS max_date FROM DB1.DESTINATION_TABLE) b
WHERE YEAR(a.DATA_DAY) >= '2015'
AND a.DATA_DAY > b.max_date;
But it's not a good implementation if there are bunches of data on DB1.DESTINATION_TABLE
.
In such case each query would task lots of sub-querys in SELECT MAX(DATA_DAY) FROM DB1.DESTINATION_TABLE
.
If possible, you could store your SELECT MAX(DATA_DAY) FROM DB1.DESTINATION_TABLE
result in another table, maybe Max_table
.
Then the sql would be like this:
SELECT COUNT(*) FROM DB2.SOURCE_TABLE
JOIN Max_table
WHERE YEAR(DB2.SOURCE_TABLE.DATA_DAY) >= '2015' and
DB2.SOURCE_TABLE.DATA_DAY > (Max_table.DATA_DAY)
Upvotes: 0
Reputation: 326
Query which you have tried contains syntax issue. HiveConf should surrounded by single quotes.
SET MAXDATE=(SELECT MAX(DATA_DAY) FROM DB1.DESTINATION_TABLE);
SELECT COUNT(*) FROM DB2.SOURCE_TABLE
WHERE YEAR(DATA_DAY) >= '2015'
AND DATA_DAY > '${HIVECONF:MAXDATE}';
Upvotes: 2