coder

Reputation: 83

Spark JavaRDD output to database

Please help me understand the best way to save the output of a Spark JavaRDD to a database.

Should I write Spark Java code to save the RDD to the database? What would be the drawbacks of this approach?

Or should I use Sqoop to load the output files into the database?

Is there any other way to do this?

Thanks

Upvotes: 1

Views: 1640

Answers (2)

coder

Reputation: 83

I used a DataFrame and saved the data to SQL Server:

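import java.util.Properties;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.SaveMode;
// Convert the RDD of WebHttpOutPutVO beans to a DataFrame.
SQLContext sqlContext = new SQLContext(context);
DataFrame outDataFrame = sqlContext.createDataFrame(finalOutPutRDD, WebHttpOutPutVO.class);
// JDBC connection properties for SQL Server.
Properties prop = new Properties();
prop.setProperty("database", "Web_Session");
prop.setProperty("user", "user");
prop.setProperty("password", "pwd@123");
prop.setProperty("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver");
// Append the rows to test_table over JDBC.
outDataFrame.write().mode(SaveMode.Append).jdbc("jdbc:sqlserver://<Host>:1433", "test_table", prop);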

Upvotes: 2

S M Abrar Jahin

Reputation: 14598

There are two approaches you can use for writing your results back to the database.

  1. Use something like DBOutputFormat and configure it for your database.

  2. Use foreachPartition on the RDD you want to save, and pass in a function that opens a connection to the database (MySQL, for example) and writes that partition's results back.
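The second approach could look roughly like the sketch below. It is only an illustration under some assumptions, not a definitive implementation: PartitionWriter, insertSql, writePartition and the batch size are hypothetical names and values, rows are assumed to arrive as Object[] in table column order, and the table name is assumed to be trusted (it is concatenated into the SQL). The helper uses only java.sql, so the same code can be called from inside foreachPartition once Spark is on the classpath.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.Iterator;

// Hypothetical helper: writes one partition's rows over a single JDBC connection.
// Intended call site (rowRdd, jdbcUrl, user, password are assumed names):
//   rowRdd.foreachPartition(rows -> {
//       try (Connection c = DriverManager.getConnection(jdbcUrl, user, password)) {
//           PartitionWriter.writePartition(c, "test_table", 3, rows, 500);
//       }
//   });
class PartitionWriter {

    // Builds "INSERT INTO <table> VALUES (?, ?, ...)" with one placeholder per column.
    static String insertSql(String table, int columnCount) {
        String placeholders = String.join(", ", Collections.nCopies(columnCount, "?"));
        return "INSERT INTO " + table + " VALUES (" + placeholders + ")";
    }

    // Inserts every row from the iterator, flushing to the database in batches.
    // Returns the number of rows handed to the driver.
    static int writePartition(Connection conn, String table, int columnCount,
                              Iterator<Object[]> rows, int batchSize) throws SQLException {
        int written = 0;
        try (PreparedStatement stmt = conn.prepareStatement(insertSql(table, columnCount))) {
            while (rows.hasNext()) {
                Object[] row = rows.next();
                for (int i = 0; i < columnCount; i++) {
                    stmt.setObject(i + 1, row[i]); // JDBC parameter indexes are 1-based
                }
                stmt.addBatch();
                written++;
                if (written % batchSize == 0) {
                    stmt.executeBatch(); // flush a full batch
                }
            }
            if (written % batchSize != 0) {
                stmt.executeBatch(); // flush the final partial batch
            }
        }
        return written;
    }
}
```

Opening the connection inside foreachPartition means one connection per partition rather than one per record, and it avoids having to serialize a Connection from the driver to the executors.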

Upvotes: 0
