Reputation: 41
We need a process to pull data from the Hadoop Distributed File System (HDFS) into a relational database (PostgreSQL) on a regular basis. We will need to transfer several million records per hour, and I am looking for the standard, widely used way to move data out of HDFS. Does anyone have any suggestions? The idea is for a web app to query PostgreSQL, which will hold the aggregated data.
Upvotes: 4
Views: 3100
Reputation: 39913
Sqoop is built for moving data between relational data stores and Hadoop. Specifically, you want sqoop-export, which reads files from an HDFS directory and inserts their records as rows into an existing database table.
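For a recurring job, something like the following could assemble the export command. This is only a rough sketch: the JDBC URL, user name, password file, table name, and HDFS directory are made-up placeholders, and the helper `build_sqoop_export_command` is a hypothetical name, not part of Sqoop. The flags themselves (`--connect`, `--username`, `--password-file`, `--table`, `--export-dir`, `--input-fields-terminated-by`, `--num-mappers`, `--batch`) are standard Sqoop 1 export options, but check them against the version you run.

```python
import shlex


def build_sqoop_export_command(jdbc_url, username, password_file, table,
                               export_dir, field_delimiter=",", num_mappers=4):
    """Return the argv list for a `sqoop export` run.

    Every argument value is supplied by the caller; nothing here is
    specific to the asker's cluster or database.
    """
    return [
        "sqoop", "export",
        "--connect", jdbc_url,                # JDBC URL of the PostgreSQL database
        "--username", username,
        "--password-file", password_file,     # file containing the password
        "--table", table,                     # target table; it must already exist
        "--export-dir", export_dir,           # HDFS directory holding the records
        "--input-fields-terminated-by", field_delimiter,
        "--num-mappers", str(num_mappers),    # number of parallel export tasks
        "--batch",                            # use JDBC batch mode for the inserts
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only (assumptions, not from the question).
    cmd = build_sqoop_export_command(
        jdbc_url="jdbc:postgresql://db.example.com:5432/reports",
        username="etl_user",
        password_file="/user/etl/.pg_password",
        table="hourly_aggregates",
        export_dir="/data/aggregates/latest",
    )
    # Print the command so it can be inspected before it is scheduled.
    print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` from whatever scheduler you already use (cron, Oozie, and so on). Building it as a list rather than a single string avoids shell quoting problems with delimiters and paths.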
Upvotes: 3