patz

Reputation: 1316

FAILED: NullPointerException null in HIVE QUERY

The following is the Hive query I am using, which calls a custom Rank function. I am running it on my local machine.

SELECT numeric_id, location, Rank(location), followers_count
FROM (
SELECT  numeric_id, location, followers_count
FROM twitter_data
DISTRIBUTE BY numeric_id, location
SORT BY numeric_id, location, followers_count desc
) a
WHERE Rank(location)<10;

My Rank function is as follows:

package org.apache.hadoop.hive.contrib.udaf.ex;

import org.apache.hadoop.hive.ql.exec.UDF;



public final class Rank extends UDF{
    private int  counter;
    private String last_key;
    public int evaluate(final String key){
      if ( !key.equalsIgnoreCase(this.last_key) ) {
         this.counter = 0;
         this.last_key = key;
      }
      return this.counter++;
    }
}

I build a JAR from the class above and then run the following statements before executing the Hive query. I have tried both a runnable JAR and a plain JAR.

ADD JAR /home/adminpc/Downloads/Project_input/Rank.jar;
CREATE TEMPORARY FUNCTION Rank AS 'org.apache.hadoop.hive.contrib.udaf.ex.Rank';

This is what I get after executing the Hive query:

hive> SELECT numeric_id, location, Rank(location), followers_count
    > FROM (
    > SELECT  numeric_id, location, followers_count
    > FROM twitter_data
    > DISTRIBUTE BY numeric_id, location
    > SORT BY numeric_id, location, followers_count desc
    > ) a
    > WHERE Rank(location)<1;
FAILED: NullPointerException null

Upvotes: 2

Views: 30252

Answers (1)

WestCoastProjects

Reputation: 63162

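Your UDF does not appear to protect against null values in the input table. Specifically, examine what happens when location is null: evaluate then calls key.equalsIgnoreCase(this.last_key) on a null key, which throws a NullPointerException.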
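
If null locations are indeed the cause, a null-safe version of the ranking logic might look like the sketch below. It is only an illustration under stated assumptions: the class name NullSafeRank and the started flag are hypothetical, and the class is written as plain Java so it can be run on its own. In Hive it would extend org.apache.hadoop.hive.ql.exec.UDF, as the Rank class in the question does.

```java
// Sketch of a null-safe variant of the question's Rank logic.
// Assumption: NullSafeRank and the "started" flag are illustrative names,
// not part of the original code. To use this in Hive, the class would
// extend org.apache.hadoop.hive.ql.exec.UDF like the original Rank class.
final class NullSafeRank {
    private int counter;
    private String lastKey;
    // Distinguishes "no row seen yet" from "the previous key was null".
    private boolean started;

    public int evaluate(final String key) {
        // Compare without dereferencing a null key: two nulls count as the
        // same key, and a null never matches a non-null key.
        final boolean sameKey = started
                && (key == null ? lastKey == null : key.equalsIgnoreCase(lastKey));
        if (!sameKey) {
            counter = 0;
            lastKey = key;
            started = true;
        }
        return counter++;
    }

    public static void main(String[] args) {
        NullSafeRank rank = new NullSafeRank();
        String[] locations = {"NY", "ny", null, null, "LA"};
        for (String location : locations) {
            System.out.println(location + " -> " + rank.evaluate(location));
        }
    }
}
```

In this sketch null keys form their own group, so consecutive null locations are ranked 0, 1, 2, and so on. Whether nulls should instead be skipped or filtered out in the query is a design choice that depends on the data.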

Upvotes: 3
