Debugger

Reputation: 564

Best way to import data from MySQL to HDFS

Is there a way to import data from MySQL into HDFS? There are some conditions I need to mention.

I need to know the best way to import MySQL data into HDFS and keep it updated in real time.

Upvotes: 0

Views: 2193

Answers (3)

Vijay Innamuri

Reputation: 4372

Yes, you can access the database and HDFS via JDBC connectors and the Hadoop Java API.

But when a MapReduce job accesses a database directly, several things are out of your control:

  • Each mapper/reducer tries to establish a separate connection to the database, which eventually hurts database performance.
  • There is no way to tell which mapper/reducer handles which portion of the query result set.
  • If only a single mapper/reducer accesses the database, Hadoop's parallelism is lost.
  • A fault-tolerance mechanism has to be implemented in case any mapper/reducer fails.
  • The list goes on.
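
One common way around the second point is to give each mapper its own disjoint slice of a numeric key, so it is always known which task reads which rows. Below is a minimal sketch of that idea, assuming an integer key column with known minimum and maximum values. The function names `split_ranges` and `range_query` and the `orders`/`id` names are hypothetical, and this is an illustration of the technique rather than any tool's actual implementation.

```python
def split_ranges(min_id, max_id, num_mappers):
    """Split the inclusive range [min_id, max_id] into contiguous, disjoint chunks.

    Returns a list of (low, high) tuples, at most one per mapper.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if max_id < min_id:
        return []  # empty table: nothing to read

    total = max_id - min_id + 1
    parts = min(num_mappers, total)     # never create empty ranges
    base, extra = divmod(total, parts)  # the first `extra` chunks get one more key

    ranges = []
    start = min_id
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_query(table, column, low, high):
    """Build the query one mapper would run for its slice.

    Table and column names are assumed to be trusted identifiers.
    """
    return f"SELECT * FROM {table} WHERE {column} BETWEEN {low} AND {high}"


# Hypothetical table with ids 1..10 read by 3 mappers.
for low, high in split_ranges(1, 10, 3):
    print(range_query("orders", "id", low, high))
```

If the key values are heavily skewed, equal-width ranges like these will give some mappers far more rows than others, so the choice of split column matters.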

To overcome all these hurdles, Sqoop was developed to transfer data between an RDBMS and HDFS.
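
A Sqoop import is normally launched from the command line. As a rough illustration, the sketch below only assembles the argument list for `sqoop import` in Python so it can be printed or handed to `subprocess.run`. The host, database, table, paths, and the helper name `sqoop_import_command` are hypothetical, and the flags should be checked against the Sqoop version you have installed.

```python
import shlex


def sqoop_import_command(jdbc_url, username, password_file, table, target_dir,
                         split_by, num_mappers=4,
                         check_column=None, last_value=None):
    """Return the argument list for a `sqoop import` run (hypothetical helper)."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,  # keeps the password off the command line
        "--table", table,
        "--target-dir", target_dir,        # HDFS directory to write into
        "--split-by", split_by,            # column used to divide work among mappers
        "--num-mappers", str(num_mappers),
    ]
    if check_column is not None:
        # Incremental append: only fetch rows whose check column exceeds last_value.
        args += ["--incremental", "append", "--check-column", check_column]
        if last_value is not None:
            args += ["--last-value", str(last_value)]
    return args


# Assumed example values; replace with your own connection details.
cmd = sqoop_import_command(
    jdbc_url="jdbc:mysql://db.example.com/shop",
    username="etl",
    password_file="/user/etl/.mysql-password",
    table="orders",
    target_dir="/data/shop/orders",
    split_by="id",
    check_column="id",
    last_value=1000,
)
print(" ".join(shlex.quote(a) for a in cmd))
```

Note that an incremental import like this is a periodic batch pull of new rows. It can be scheduled frequently, but it is not real-time replication.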

Upvotes: 0

Nishant

Reputation: 74

You can do real-time import using CDC (change data capture) and Talend: http://www.talend.com/talend-big-data-sandbox

Upvotes: 0

Arnon Rotem-Gal-Oz

Reputation: 25909

Why don't you want to use Sqoop? It does what you would otherwise have to do yourself (open a JDBC connection, get the data, write it to Hadoop). See this presentation from Hadoop World 09.

Upvotes: 2
