Jon Cardoso-Silva

Reputation: 990

Is there a way to use JDBC as an input source for Hadoop's MapReduce?

I have data in a PostgreSQL DB and I'd like to read it, process it, and save it to an HBase DB. Is it possible to somehow distribute the JDBC read across Map tasks?

Upvotes: 2

Views: 1038

Answers (2)

twid

Reputation: 6686

Yes, you can do that with DBInputFormat:

DBInputFormat uses JDBC to connect to data sources. Because JDBC is widely implemented, DBInputFormat can work with MySQL, PostgreSQL, and several other database systems. Individual database vendors provide JDBC drivers to allow third-party applications (like Hadoop) to connect to their databases.

The DBInputFormat is an InputFormat class that allows you to read data from a database. An InputFormat is Hadoop’s formalization of a data source; it can mean files formatted in a particular way, data read from a database, etc. DBInputFormat provides a simple method of scanning entire tables from a database, as well as the means to read from arbitrary SQL queries performed against the database.

LINK
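To make this concrete, here is a minimal sketch of a job that reads a PostgreSQL table through DBInputFormat. The class, table, host, and column names (`UserRecord`, `users`, `dbhost`, `id`, `name`) are hypothetical, and the mapper/output setup is omitted:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
    import org.apache.hadoop.mapreduce.lib.db.DBInputFormat;
    import org.apache.hadoop.mapreduce.lib.db.DBWritable;

    public class PostgresImport {

        // Each input record must implement Writable and DBWritable so Hadoop
        // can deserialize it from a JDBC ResultSet and move it between tasks.
        public static class UserRecord implements Writable, DBWritable {
            long id;
            String name;

            public void readFields(ResultSet rs) throws SQLException {
                id = rs.getLong(1);
                name = rs.getString(2);
            }

            public void write(PreparedStatement ps) throws SQLException {
                ps.setLong(1, id);
                ps.setString(2, name);
            }

            public void readFields(DataInput in) throws IOException {
                id = in.readLong();
                name = in.readUTF();
            }

            public void write(DataOutput out) throws IOException {
                out.writeLong(id);
                out.writeUTF(name);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // JDBC driver class, connection string, and credentials
            // (hypothetical values).
            DBConfiguration.configureDB(conf,
                    "org.postgresql.Driver",
                    "jdbc:postgresql://dbhost:5432/mydb",
                    "user", "password");

            Job job = Job.getInstance(conf, "postgres-import");
            job.setJarByClass(PostgresImport.class);
            job.setInputFormatClass(DBInputFormat.class);

            // Scan the whole table. DBInputFormat splits the input by row
            // count using LIMIT/OFFSET; ordering by "id" keeps those splits
            // consistent, one range per map task.
            DBInputFormat.setInput(job, UserRecord.class,
                    "users",            // table name
                    null,               // optional WHERE conditions
                    "id",               // ORDER BY column
                    "id", "name");      // columns to select

            // Mapper and output configuration omitted for brevity; the mapper
            // receives (LongWritable, UserRecord) pairs and could write to HBase.
            job.waitForCompletion(true);
        }
    }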

Upvotes: 3

Kukanani

Reputation: 758

I think you're looking for Sqoop, which is designed to import data from relational databases into the Hadoop stack. It pulls the data over a JDBC connection and writes it into HDFS, splitting it across your Hadoop DataNodes.

SQl to hadOOP = SQOOP, get it?

Sqoop can import into HBase. See this link.
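For illustration, a minimal sqoop import invocation that pulls a PostgreSQL table straight into an HBase table might look like this (hostname, database, table, row key, and column-family names are hypothetical):

    sqoop import \
      --connect jdbc:postgresql://dbhost:5432/mydb \
      --username user \
      --password password \
      --table users \
      --hbase-table users \
      --column-family cf \
      --hbase-row-key id \
      --hbase-create-table

Sqoop runs this as a MapReduce job under the hood, so the import itself is parallelized across map tasks.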

Upvotes: 2
