chbh

Reputation: 336

How does a Hadoop or Spark connector for distributed data stores function?

Spark has connectors for a variety of databases and data stores.

However, what would be required to create a connector for your own custom distributed database? From what I understand, Spark uses Hadoop connectors to fetch data from a distributed data store. I wasn't able to find a good resource that explains how a Hadoop connector works and how to build one.

I'd like to understand the semantics of a Hadoop connector so that I can create one for my custom database.

Upvotes: 0

Views: 92

Answers (1)

Javier Bañez

Reputation: 39

You have to implement a RecordReader in Java with the Hadoop API, together with the InputFormat that divides the data into splits and creates a reader for each one.

Spark will then be able to use it, for example through SparkContext.newAPIHadoopRDD.
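
To make the semantics concrete, here is a rough sketch of the contract in plain Java. It is only an illustration: Split, Reader, Format and FakeDbFormat are made-up stand-ins that mirror the shape of Hadoop's org.apache.hadoop.mapreduce.InputFormat and RecordReader, not the real classes, and the "database" is just a range of integer keys spread over hypothetical hosts.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the connector contract; NOT the real Hadoop classes.
final class ConnectorSketch {

    /** One independently readable slice of the store (assumed: a key range on one host). */
    static final class Split {
        final String host;
        final int startKey; // inclusive
        final int endKey;   // exclusive

        Split(String host, int startKey, int endKey) {
            this.host = host;
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Mirrors the shape of a RecordReader: advance first, then read the current pair. */
    interface Reader extends AutoCloseable {
        boolean nextKeyValue();
        int getCurrentKey();
        String getCurrentValue();
        @Override
        void close();
    }

    /** Mirrors the shape of an InputFormat: plan the splits, then open a reader per split. */
    interface Format {
        List<Split> getSplits();
        Reader createRecordReader(Split split);
    }

    /** Hypothetical store holding keys 0..totalKeys-1, partitioned evenly across hosts. */
    static final class FakeDbFormat implements Format {
        private final int totalKeys;
        private final List<String> hosts;

        FakeDbFormat(int totalKeys, List<String> hosts) {
            this.totalKeys = totalKeys;
            this.hosts = hosts;
        }

        @Override
        public List<Split> getSplits() {
            // One split per host, each covering a contiguous key range.
            List<Split> splits = new ArrayList<>();
            int n = hosts.size();
            for (int i = 0; i < n; i++) {
                splits.add(new Split(hosts.get(i), i * totalKeys / n, (i + 1) * totalKeys / n));
            }
            return splits;
        }

        @Override
        public Reader createRecordReader(Split split) {
            return new Reader() {
                private int current = split.startKey - 1; // positioned before the first record

                @Override
                public boolean nextKeyValue() {
                    if (current + 1 >= split.endKey) {
                        return false; // split exhausted
                    }
                    current++;
                    return true;
                }

                @Override
                public int getCurrentKey() {
                    return current;
                }

                @Override
                public String getCurrentValue() {
                    // A real connector would fetch the row from the node owning this range.
                    return "row-" + current + "@" + split.host;
                }

                @Override
                public void close() {
                    // Nothing to release here; a real reader would close its connection.
                }
            };
        }
    }

    /** Roughly what the framework does in each task: drain one split through its reader. */
    static List<String> readSplit(Format format, Split split) {
        List<String> rows = new ArrayList<>();
        try (Reader reader = format.createRecordReader(split)) {
            while (reader.nextKeyValue()) {
                rows.add(reader.getCurrentKey() + "=" + reader.getCurrentValue());
            }
        }
        return rows;
    }
}
```

In the real API the two methods on InputFormat are getSplits(JobContext) and createRecordReader(InputSplit, TaskAttemptContext), and RecordReader also has initialize and getProgress. The idea is the same though: the splits decide how the read is parallelised (each split becomes one task or partition), and the reader turns one split into a stream of key/value pairs.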

My suggestion is to start by reading Tom White's book, Hadoop: The Definitive Guide.

Upvotes: 1
