Reputation: 161
I'm using the source command in Hive to run an external file that contains a number of Hive UDFs (plain SQL, like date transformations). The external file is shared by many scripts, so it is easier to maintain outside of the individual scripts.
So, if I have
source /tmp/udfs.hql;
select * from tmp1
and run it from the command line, i.e.
hive -e "......."
it works fine.
Of course, if I try to do it in Oozie or a non-CLI client, it fails, as source is a CLI command.
Now the question is: how do I replicate this functionality outside of the CLI? In other words, how do I execute the source command in a Hive query?
Upvotes: 0
Views: 3352
Reputation: 9067
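Clumsy workaround: use an Oozie Shell Action to concatenate the common scripts and the action-specific script into a single temporary file on HDFS, then point the Hive Action at that file.
Code sample for the Shell Action: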
typeset CurrentJobInfo CurrentJobId TargetHiveScript
if [[ "$CONTAINER_ID" != "" && "$OOZIE_ACTION_CONF_XML" != "" ]]
then
  CurrentJobInfo=$(/bin/sed -n '/<name>mapreduce.job.name<\/name>/ { N ; s/^.*<value>oozie:action:/:/ ; s/<\/value>.*$/:/ ; p}' "$OOZIE_ACTION_CONF_XML")
  CurrentJobId=$(/bin/echo "$CurrentJobInfo" | /bin/sed -n '/:ID=[^:]*:/ { s/^.*ID=// ; s/:.*$// ; p }')
fi
if [[ "$CurrentJobId" == "" ]]
then
  /bin/echo "ERROR - could not find Oozie Job ID in expected XML config file" 1>&2
  exit 255
fi
TargetHiveScript="/user/johndoe/temp/${CurrentJobId}-DummyHiveAction.hql"
# all these ".hql" scripts are assumed to be available in the CWD thanks to <file> elements in Oozie Shell Action
/bin/cat common.hql common.DummyApp.hql DummyHiveAction.hql | /usr/bin/hdfs dfs -put -f - "$TargetHiveScript"
if [[ $? -ne 0 ]]
then
  /bin/echo "ERROR - could not upload Hive script" 1>&2
  exit 255
fi
exit 0
In the Hive Action the reference to that file should be
<script>/user/johndoe/temp/${wf:id()}-DummyHiveAction.hql</script>
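Since the job-ID extraction is the fragile part, here is a minimal sketch for trying it out locally before deploying. The function name extract_oozie_job_id is hypothetical, and it assumes the config file holds the mapreduce.job.name property with the name and value on consecutive lines and a value starting with oozie:action:, which is what the sed expressions in the script expect. Check a real action config file before relying on this.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Oozie): extract the workflow job ID
# from an action config file, using the same two sed passes as the script.
# Assumes <name>mapreduce.job.name</name> is followed on the next line by
# <value>oozie:action:...:ID=<job id>...</value>.
extract_oozie_job_id() {
  local conf_file="$1" job_info
  # Pass 1: join the <name> line with the next line and keep only the
  # colon-delimited payload of the value, wrapped in leading/trailing ':'.
  job_info=$(sed -n '/<name>mapreduce.job.name<\/name>/ { N ; s/^.*<value>oozie:action:/:/ ; s/<\/value>.*$/:/ ; p ; }' "$conf_file")
  # Pass 2: keep only what sits between "ID=" and the next ':'.
  printf '%s\n' "$job_info" | sed -n '/:ID=[^:]*:/ { s/^.*ID=// ; s/:.*$// ; p ; }'
}
```

Running it against a hand-written sample file shows whether the patterns still match; inside the Shell Action the argument would be "$OOZIE_ACTION_CONF_XML". An empty result means the layout differs from what is assumed here.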
PS: I did not test all that end-to-end, just did some copy/paste/edit from bits of code that run on our site. The debugging is all yours :-)
Upvotes: 1