Muhammad Bekette
Muhammad Bekette

Reputation: 1434

Impala query cannot retrieve result with NullPointerException

I have the following query that I run over hive/impala:

select count(p.id) as tweet_count, p.author as author,p.profile_image_url as profile_image_url,p.screen_name as screen_name,
concat_ws('/',min(p.postday),min(p.postmonth),min(p.postyear) ) as creation_date,p.message message,af.followerid as follower 
from post p 
inner join author_follower af on af.id like if(p.author= null, '', concat(p.author,'%'))
where p.hashtaglist like 'hashtagtobeused' 
group by author,profile_image_url,screen_name,message,follower
ORDER BY cast(min(postyear) as int),cast(min(postmonth) as int),cast(min(postday) as int),cast(min(posthour) as int) ASC;

but for some reason I get the following error result

Your query has the following error(s):

Bad status for request 3304: TGetOperationStatusResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=5, errorMessage=None, sqlState=None, errorCode=None)

I checked the query and I cannot find a problem with it, can anyone please help and guide to where the problem is?why do I have this error instead of result set

Upvotes: 1

Views: 1037

Answers (1)

Tom Harrison
Tom Harrison

Reputation: 14018

Consider carefully reformatting the query, as in some cases, Impala crashes with SEGV when the SQL parse itself fails due to simple issues like spaces. If you're running Cloudera, you'll find logs in /run/cloudera-scm-agent/process on the node that ran the query.

We have resolved these issues by being careful about SQL formatting (which is also just good practice as it make query errors easier to spot), e.g.

SELECT
    COUNT(p.id)                                                     AS tweet_count,
    p.author                                                        AS author,
    p.profile_image_url                                             AS profile_image_url,
    p.screen_name                                                   AS screen_name,
    concat_ws('/', MIN(p.postday), MIN(p.postmonth), MIN(p.postyear) ) AS creation_date,
    p.message                                                       AS MESSAGE,
    af.followerid                                                   AS follower
FROM
    post p
INNER JOIN
    author_follower af
ON
    af.id LIKE IF(p.author = NULL, '', concat(p.author, '%'))
WHERE
    p.hashtaglist LIKE 'hashtagtobeused'
GROUP BY
    author,
    profile_image_url,
    screen_name,
    MESSAGE,
    follower
ORDER BY
    CAST(MIN(postyear) AS INT),
    CAST(MIN(postmonth) AS INT),
    CAST(MIN(postday) AS INT),
    CAST(MIN(posthour) AS INT) ASC;

(By the way, I used dbVisualizer to validate and reformat the query syntax -- great tool to consider)

Upvotes: 1

Related Questions