Reputation: 109
I started using Spark recently. I have a use case where I need to process a file and store the output in a PostgreSQL database. I am able to read and process the file, but I am not able to store the processed data in the database. Can someone please suggest how I can save the output to the database?
Thanks.
Upvotes: 2
Views: 424
Reputation: 11381
If the database is accessible from all worker nodes, you can use foreachPartition
to save the output, opening one connection per partition. Pseudocode:
rdd.foreachPartition { records =>
  // Connect to the database (once per partition, on the worker)
  records.foreach { r =>
    // Loop over records and save each one
  }
  // Close the connection to the db
}
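For a more concrete picture, here is a minimal sketch of the pseudocode above using plain JDBC. It assumes the PostgreSQL JDBC driver is on the executors' classpath and that the RDD holds (Int, String) pairs. The object name PostgresSink, the URL, the credentials, and the table results(id, value) are all hypothetical placeholders, not something from the question, so adjust them to your setup.

```scala
import java.sql.DriverManager
import org.apache.spark.rdd.RDD

object PostgresSink {
  // Hypothetical connection details and table; replace with your own.
  val jdbcUrl  = "jdbc:postgresql://dbhost:5432/mydb"
  val user     = "spark_user"
  val password = "secret"
  val insertSql = "INSERT INTO results (id, value) VALUES (?, ?)"

  def save(rdd: RDD[(Int, String)]): Unit =
    rdd.foreachPartition { records =>
      // Open one connection per partition; this runs on the worker.
      val conn = DriverManager.getConnection(jdbcUrl, user, password)
      try {
        val stmt = conn.prepareStatement(insertSql)
        try {
          // Queue every record of the partition into a single batch.
          records.foreach { case (id, value) =>
            stmt.setInt(1, id)
            stmt.setString(2, value)
            stmt.addBatch()
          }
          // Send the queued inserts to the database.
          stmt.executeBatch()
        } finally stmt.close()
      } finally conn.close()
    }
}
```

You would call it as PostgresSink.save(processedRdd). The connection is created inside foreachPartition because a JDBC connection cannot be serialized and shipped from the driver to the workers, and opening it per partition rather than per record keeps the number of connections small.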
Upvotes: 3