Reputation: 11
I am trying to create a process that hits Hadoop and extracts data to my local Windows machine. I successfully created an ODBC connection and was able to test it. Researching further, I found that I needed to use the Microsoft Hive ODBC driver, but I have not been able to get a successful connection test with it. I am open to using different tools, but would like some input on the best way to accomplish what I am trying to do. The data I am looking for also exists on an FTP server and has been loaded into Hadoop; I could get it from the FTP server, but would rather pull it from Hadoop. I am brand new to Hadoop, and I have researched and read but have not been able to find a solution. I know the solution is there, I am just not looking in the right place. Could someone please point me in the right direction?
Upvotes: -1
Views: 1190
Reputation: 191701
hits Hadoop and extracts data to my local windows machine
First suggestion: Apache Spark
I successfully created an ODBC connection and was able to test it
Hadoop does not provide ODBC... Hive does
Researching further, I found that I needed to use the Microsoft Hive ODBC driver
Is your data in Azure? That's the only reason you'd be using a Microsoft driver, as far as I can tell
would like some input on the best way to accomplish what I am trying to do
That much is unclear... You've only mentioned SQL tools so far, but data sitting in plain HDFS isn't accessible over ODBC...
If you are storing the data in Hive, JDBC/ODBC will work fine, but Spark would be quicker if you run it on the YARN cluster within Hadoop.
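To make the ODBC route concrete, here is a hedged sketch of the kind of connection string the Microsoft Hive ODBC driver expects. The host, credentials, and defaults below are placeholder assumptions, not details from your setup; check them against your driver's documentation and your DSN configuration.

```python
# Sketch: building a connection string for the Microsoft Hive ODBC
# driver. HiveServer2 listens on port 10000 by default, and AuthMech=3
# selects username/password authentication in the Hive ODBC drivers.
# Every concrete value here is a placeholder.
def hive_odbc_conn_str(host, port=10000, auth_mech=3, user="", password=""):
    return (
        "Driver={Microsoft Hive ODBC Driver};"
        f"Host={host};Port={port};"
        f"AuthMech={auth_mech};"
        f"UID={user};PWD={password}"
    )

# A library such as pyodbc could then consume this string, e.g.:
# pyodbc.connect(hive_odbc_conn_str("namenode.example.com",
#                                   user="me", password="..."))
```

If your connection test keeps failing, comparing the string your DSN generates against something like this often surfaces the mismatch (wrong port, wrong auth mechanism).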
I could get it from the FTP server but would rather pull it from Hadoop
Personally, I would not recommend you get it from Hadoop when the FTP server already has the same data; going straight to the source is simpler.
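Since the data already lives on the FTP server, pulling it directly is a few lines of standard-library Python. This is a minimal sketch; the host, credentials, and paths are placeholders you would replace with your own.

```python
# Sketch: downloading the file straight from the FTP server, skipping
# Hadoop entirely. Host, credentials, and paths are placeholders.
import ftplib
import os

def download_from_ftp(host, user, password, remote_path, local_dir):
    """Fetch remote_path from the FTP server into local_dir and
    return the resulting local file path."""
    local_path = os.path.join(local_dir, os.path.basename(remote_path))
    with ftplib.FTP(host) as ftp:
        ftp.login(user, password)
        with open(local_path, "wb") as f:
            ftp.retrbinary(f"RETR {remote_path}", f.write)
    return local_path
```

You could schedule a script like this with Windows Task Scheduler and be done, with no Hadoop drivers involved.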
Second suggestion: If you are dead-set on using a tool within the Hadoop ecosystem, but not explicitly HDFS, try the Apache NiFi project, which provides a GetFTP processor.
Upvotes: 1