Mohit Rane

Reputation: 279

Hive UDF execution via shell script

I have a Hive UDF that works well in the Hive terminal, and I want to execute it via a shell script. In the Hive terminal I am able to execute the following commands:

use mashery_db;
add jar hdfs://nameservice1/tmp/nextdata_aggregations/custom_jar/readerCheck.jar;
add file hdfs://nameservice1/tmp/GeoLite2-City.mmdb;
CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';

But when I add the above code to a shell script:

hive -e "use mashery_db;"
hive -e "add jar hdfs://nameservice1/tmp/nextdata_aggregations/custom_jar/readerCheck.jar;"
hive -e "add file hdfs://nameservice1/tmp/GeoLite2-City.mmdb;"
hive -e "CREATE TEMPORARY FUNCTION geoip AS "com.mashery.nextdata.hive.udf.GeoIPGenericUDF";"

The first 'hive -e' calls work well and the jar gets added, but the last one (create temporary function) doesn't work. I am getting the error below:

FAILED: ParseException line 1:35 mismatched input 'com' expecting StringLiteral near 'AS' in create function statement

I have also tried single quotes around the class name:

hive -e "CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';"

Then I get:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

FAILED: Class com.mashery.nextdata.hive.udf.GeoIPGenericUDF not found

Can a Hive UDF be used from a shell script? If it can, what am I doing wrong? Thanks in advance.

Upvotes: 0

Views: 848

Answers (2)

Aaron Faltesek

Reputation: 349

You can get this to work with both hive -e and hive -f:

hive -e "use mashery_db;
add jar hdfs://nameservice1/tmp/nextdata_aggregations/custom_jar/readerCheck.jar;
add file hdfs://nameservice1/tmp/GeoLite2-City.mmdb;
CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';"

Putting the statements in a file and running hive -f hive_file.hql would work as well.
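
As an aside on the ParseException in the question: that error is consistent with the class name reaching Hive without any quotes around it, which is what the shell produces if the class name is wrapped in double quotes inside an already double-quoted -e string. A minimal sketch of that shell behaviour, with no Hive involved (the variable names stripped and kept are just for illustration):

```shell
#!/bin/sh
# Inner double quotes close and reopen the outer string, so the shell
# removes them before the command ever sees the text.
stripped="CREATE TEMPORARY FUNCTION geoip AS "com.mashery.nextdata.hive.udf.GeoIPGenericUDF";"

# Single quotes inside double quotes are ordinary characters and are kept.
kept="CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';"

printf '%s\n' "$stripped"
printf '%s\n' "$kept"
```

The first line printed has a bare class name after AS, the second keeps it as a quoted string literal, which is why the single-quoted form is the one to use inside hive -e "...".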

Upvotes: 0

Roberto Congiu

Reputation: 5213

Each invocation of hive -e spawns a new process with a new Hive shell that has no memory of what the previous one did, so Hive 'forgets' where the UDF is. One solution is to chain the statements in a single command, but it's better form to put all your Hive commands in a file (for instance "commands.hql") and run hive -f commands.hql instead of using -e.

The file would look like this:

use mashery_db;
add jar hdfs://nameservice1/tmp/nextdata_aggregations/custom_jar/readerCheck.jar;
add file hdfs://nameservice1/tmp/GeoLite2-City.mmdb;
CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';
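
If you want the shell script to own everything, one way to sketch it is to have the script write the file and then run it. This is only a sketch under a few assumptions: the file is written to the current directory, and the check for hive on the PATH is my addition so the script reports a missing client instead of failing obscurely.

```shell
#!/bin/sh
# Write all statements to one file so they run in a single Hive session.
# The quoted delimiter ('EOF') stops the shell from expanding or
# unquoting anything inside the here-document.
cat > commands.hql <<'EOF'
use mashery_db;
add jar hdfs://nameservice1/tmp/nextdata_aggregations/custom_jar/readerCheck.jar;
add file hdfs://nameservice1/tmp/GeoLite2-City.mmdb;
CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';
EOF

# Run the whole file in one session; if hive is not on PATH, say so.
if command -v hive >/dev/null 2>&1; then
    hive -f commands.hql
else
    echo "hive not found on PATH; commands.hql was written but not run" >&2
fi
```

Any query that calls geoip has to go in the same file, because a temporary function only exists for the session that created it.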

Upvotes: 1
