themightyhulk

Reputation: 175

Hive query - FAILED SemanticException invalid path

Here is my problem:

My initial Azure subscription (a 30-day trial) was shut down when I used up the free credits, and I have just had it converted to a Pay-As-You-Go subscription. Everything is working again: I still have the same resource group, under which I created a new cluster. The files with my CSV data are all still present in the container I created last time (not the default container, but one I created earlier). The only thing I had to recreate was the Hive table to load the data into, and I was able to do that as well. However, when I then try to run a Hive query to actually load data from the CSV file into the Hive table, as follows...

LOAD DATA INPATH '/container1/HdiSamples/user/data-file.csv' OVERWRITE INTO TABLE default.hive_table;

...I constantly receive "Failed" as the error message (I use Data Lake Tools for Visual Studio to upload blobs and run the queries). In the detailed error log, the line beginning with 'FAILED: SemanticException' stands out each time, even though I have tried different locations for the file upload.

16/12/01 04:16:25 WARN conf.HiveConf: HiveConf of name hive.log.dir does not exist
FAILED: SemanticException Line 1:17 Invalid path ''/container1/HdiSamples/user/data-file.csv'': No files matching path wasb://[email protected]/container1/HdiSamples/user/data-file.csv

Here is my question:

Can anyone tell me why it doesn't find and load the file from the location where it actually resides?

I just don't get the cause for this error...

Upvotes: 3

Views: 1690

Answers (1)

themightyhulk

Reputation: 175

Although it's been a while since I asked this question, I worked out a solution to the issue myself, which I thought I'd share with others...

I had problems for about a week, being unable to load data into the Hive tables from the Azure Blob Storage. I had two CSV-files called data-file.csv and data-file-extended-1.CSV in my blob. Please note the capitals in the file extension here!

Hive and Hadoop do NOT accept these files unless...

a) the filename is spelled exactly the same way, including the capitals in the file extension, AND
b) the filename is shortened drastically, without the hyphens and numbers (in my case I used only 6 conjoined letters, i.e. "datfil" and "datfix")

Shockingly, neither the official Azure documentation nor anything else I found on the web mentions these issues. However, these two adjustments will resolve the error message.
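
As a rough sketch of what the adjusted statement might look like: the folder and table name below are taken from the question, while the file name datfil.csv is an assumption (the shortened name with a lowercase extension). Whatever the blob is actually called, the name in the statement has to match it character for character, capitals included.

```sql
-- Sketch only, not a verified fix.
-- Assumption: the blob was re-uploaded to the same folder under the
-- shortened name "datfil.csv"; use the exact capitalisation of the
-- real blob name, including its extension.
LOAD DATA INPATH '/container1/HdiSamples/user/datfil.csv' OVERWRITE INTO TABLE default.hive_table;
```

It may also be worth looking at the path in the error log: the unqualified path was resolved against the cluster's default container, with /container1 treated as a folder inside it, so the location the statement points to deserves a second check alongside the file name.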

Just to let people know...

Upvotes: 2
