Reputation: 148
I want to transform a Hive table (HDFS spot instances) using a Python UDF that needs the external library "user-agents". My UDF works fine without the external library, but I cannot get it working once I try to use the library.
I tried installing the library from within the UDF code itself, as shown below.
import sys
import subprocess
import pip
import os
sys.stdout = open(os.devnull, 'w+')
pip.main(['install', '--user', 'pyyaml'])
pip.main(['install', '--user', 'ua-parser'])
pip.main(['install', '--user', 'user-agents'])
sys.stdout = sys.__stdout__
After that, I tried this:
import user_agents
but the UDF crashes with a "No module found" exception. I also checked the following paths from code:
/usr/local/lib/python2.7/site-packages
/usr/local/lib64/python2.7/site-packages
But there was no user_agents module in either location. Any help on how to get this working would be really appreciated. Thanks!
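For context, a `--user` install normally goes to the per-user site directory rather than the system-wide site-packages paths listed above, so it may be worth printing where the UDF's interpreter actually looks. Below is a minimal diagnostic sketch; the helper names `describe_import_paths` and `log_import_paths` are made up for illustration, and it writes to stderr on the assumption that stdout is reserved for the rows the UDF emits.

```python
import site
import sys


def describe_import_paths():
    """Collect the interpreter and import locations as printable lines."""
    lines = ["executable: " + str(sys.executable)]
    # Directory that `pip install --user` targets for this interpreter.
    lines.append("user site: " + str(site.getusersitepackages()))
    # False/None here means the user site directory is not on sys.path.
    lines.append("user site enabled: " + str(site.ENABLE_USER_SITE))
    for entry in sys.path:
        lines.append("sys.path: " + str(entry))
    return lines


def log_import_paths(stream=sys.stderr):
    """Write the diagnostic lines to stderr so stdout stays clean."""
    for line in describe_import_paths():
        stream.write(line + "\n")
```

Calling `log_import_paths()` at the top of the UDF should make the lines show up in the task's stderr log, assuming the cluster keeps that log.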
Upvotes: 1
Views: 1148
Reputation: 148
I figured out a way around this. Anyone still struggling with the same UDF issue can try this solution and check whether it works for them too.
For external libraries, do the following steps:
Step 1: Force pip to install the external library, from the code itself, into the current working directory of your UDF.
import sys
import os
import pip
sys.stdout = open(os.devnull, 'w+')
pip.main(['install', 'user-agents', '-t', os.getcwd(), '--ignore-installed'])
sys.stdout = sys.__stdout__
Step 2: Update your sys.path
sys.path.append(os.getcwd())
Step 3: Now import the library :)
from user_agents import parse
That's it. Please check and confirm whether this works for you too.
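One caveat: `pip.main` was moved out of pip's public interface in pip 10, so the steps above may fail on newer pip versions. A hedged alternative is to run pip as a subprocess of the same interpreter. The sketch below assumes pip is available to that interpreter and the worker node can reach a package index; the helper names `build_pip_command`, `install_to_dir` and `add_to_sys_path` are made up for illustration.

```python
import os
import subprocess
import sys


def build_pip_command(package, target_dir):
    """Build the pip invocation that installs `package` into `target_dir`."""
    return [sys.executable, "-m", "pip", "install", package,
            "--target", target_dir, "--ignore-installed"]


def install_to_dir(package, target_dir):
    """Run pip in a child process and return its exit code (0 on success)."""
    # Discard pip's output so it cannot mix with the rows the UDF
    # writes to stdout.
    with open(os.devnull, "w") as devnull:
        return subprocess.call(build_pip_command(package, target_dir),
                               stdout=devnull, stderr=devnull)


def add_to_sys_path(directory):
    """Append `directory` to sys.path once, so imports can find it."""
    if directory not in sys.path:
        sys.path.append(directory)
```

In the UDF this would be used as `install_to_dir('user-agents', os.getcwd())`, then `add_to_sys_path(os.getcwd())`, followed by `from user_agents import parse`. Checking the return code before importing gives a clearer failure than a bare import error.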
Upvotes: 2