Reputation: 336
Spark has connectors for a variety of databases and data stores.
However, what would be required to create a connector for your own custom distributed database? From what I understand, Spark uses Hadoop connectors to fetch data from a distributed data store, but I wasn't able to find a good resource explaining how a Hadoop connector works or how to write one.
I'm looking to understand the semantics of a Hadoop connector so that I can create one for my custom database.
Upvotes: 0
Views: 92
Reputation: 39
You have to implement a RecordReader (and the InputFormat that creates it) in Java using the Hadoop API. Spark can then read your database through that InputFormat.
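A minimal sketch of what that looks like with the new Hadoop API (org.apache.hadoop.mapreduce). The class names MyDatabaseInputFormat, MyDatabaseSplit and MyDatabaseRecordReader are placeholders, and the rows are hard-coded where a real reader would query your database:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

public class MyDatabaseInputFormat extends InputFormat<NullWritable, Text> {

    // One InputSplit per partition/shard of the database; Spark schedules one task per split.
    @Override
    public List<InputSplit> getSplits(JobContext context) throws IOException {
        // A real connector would ask the database for its shards and return one split each.
        return Collections.singletonList((InputSplit) new MyDatabaseSplit());
    }

    @Override
    public RecordReader<NullWritable, Text> createRecordReader(InputSplit split,
                                                               TaskAttemptContext context) {
        return new MyDatabaseRecordReader();
    }

    // A trivial split; a real one would carry shard ids, key ranges, host hints, etc.
    public static class MyDatabaseSplit extends InputSplit implements Writable {
        @Override public long getLength() { return 0; }
        @Override public String[] getLocations() { return new String[0]; }
        @Override public void write(DataOutput out) throws IOException { }
        @Override public void readFields(DataInput in) throws IOException { }
    }

    // Iterates over the rows that belong to one split.
    public static class MyDatabaseRecordReader extends RecordReader<NullWritable, Text> {
        private Iterator<String> rows;
        private final Text currentValue = new Text();

        @Override
        public void initialize(InputSplit split, TaskAttemptContext context) {
            // Placeholder data; a real reader would open a connection to the
            // database and fetch the rows covered by this split.
            rows = Arrays.asList("row1", "row2", "row3").iterator();
        }

        @Override
        public boolean nextKeyValue() {
            if (!rows.hasNext()) {
                return false;
            }
            currentValue.set(rows.next());
            return true;
        }

        @Override public NullWritable getCurrentKey() { return NullWritable.get(); }
        @Override public Text getCurrentValue() { return currentValue; }
        @Override public float getProgress() { return 0.0f; }
        @Override public void close() { }
    }
}
```

Spark can then consume it with something like JavaSparkContext.newAPIHadoopRDD(conf, MyDatabaseInputFormat.class, NullWritable.class, Text.class).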
My suggestion is to start by reading Tom White's book, Hadoop: The Definitive Guide, which covers InputFormats and RecordReaders in detail.
Upvotes: 1