Reputation: 2737
I am using Hive with a Python UDF. I defined a SQL file in which I add the Python UDF and call it. So far so good, and I can process my query results with my Python function. However, at this point I need to use an external .txt file in my Python UDF. I uploaded that file to my cluster (the same directory as the .sql and .py files) and I also added it in my .sql file using this command:
ADD FILE /home/ra/stopWords.txt;
When I open this file in my Python UDF like this:
file = open("/home/ra/stopWords.txt", "r")
I get several errors. I cannot figure out how to add files and use them from a Hive UDF.
Any idea?
Upvotes: 4
Views: 1339
Reputation: 1182
All added files are located in the current working directory (./) of the UDF script.
If you add a single file:
ADD FILE /dir1/dir2/dir3/myfile.txt
its path will be:
./myfile.txt
If you add a directory:
ADD FILE /dir1/dir2
the file's path will be:
./dir2/dir3/myfile.txt
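So for the question's case, after ADD FILE /home/ra/stopWords.txt the UDF should open the file by its base name, not by its original absolute path. A minimal sketch (the load_stopwords helper and the one-word-per-line file layout are assumptions, not from the original post):

```python
import os

# Files registered with ADD FILE are staged into the task's working
# directory, so refer to them by base name relative to "./",
# not by the path they had on the client machine.
STOPWORDS_PATH = os.path.join(".", "stopWords.txt")  # ./stopWords.txt

def load_stopwords(path):
    # Assumed layout: one stop word per line; blank lines are skipped.
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}
```

The UDF script itself is also staged the same way, which is why the file and the script end up side by side regardless of where they lived on the cluster's edge node.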
Upvotes: 2