Reputation: 1434
I have the following query that I run over hive/impala:
select count(p.id) as tweet_count, p.author as author,p.profile_image_url as profile_image_url,p.screen_name as screen_name,
concat_ws('/',min(p.postday),min(p.postmonth),min(p.postyear) ) as creation_date,p.message message,af.followerid as follower
from post p
inner join author_follower af on af.id like if(p.author= null, '', concat(p.author,'%'))
where p.hashtaglist like 'hashtagtobeused'
group by author,profile_image_url,screen_name,message,follower
ORDER BY cast(min(postyear) as int),cast(min(postmonth) as int),cast(min(postday) as int),cast(min(posthour) as int) ASC;
but for some reason I get the following error result
Your query has the following error(s):
Bad status for request 3304: TGetOperationStatusResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=5, errorMessage=None, sqlState=None, errorCode=None)
I checked the query and I cannot find a problem with it, can anyone please help and guide to where the problem is?why do I have this error instead of result set
Upvotes: 1
Views: 1037
Reputation: 14018
Consider carefully reformatting the query, as in some cases, Impala crashes with SEGV when the SQL parse itself fails due to simple issues like spaces. If you're running Cloudera, you'll find logs in /run/cloudera-scm-agent/process
on the node that ran the query.
We have resolved these issues by being careful about SQL formatting (which is also just good practice as it make query errors easier to spot), e.g.
SELECT
COUNT(p.id) AS tweet_count,
p.author AS author,
p.profile_image_url AS profile_image_url,
p.screen_name AS screen_name,
concat_ws('/', MIN(p.postday), MIN(p.postmonth), MIN(p.postyear) ) AS creation_date,
p.message AS MESSAGE,
af.followerid AS follower
FROM
post p
INNER JOIN
author_follower af
ON
af.id LIKE IF(p.author = NULL, '', concat(p.author, '%'))
WHERE
p.hashtaglist LIKE 'hashtagtobeused'
GROUP BY
author,
profile_image_url,
screen_name,
MESSAGE,
follower
ORDER BY
CAST(MIN(postyear) AS INT),
CAST(MIN(postmonth) AS INT),
CAST(MIN(postday) AS INT),
CAST(MIN(posthour) AS INT) ASC;
(By the way, I used dbVisualizer to validate and reformat the query syntax -- great tool to consider)
Upvotes: 1