Reputation: 83
Please help me understand the best way to save the output of a Spark JavaRDD
to a database.
Should I write Spark Java code that saves the RDD
to the database directly? What would be the drawbacks of this approach?
Or should I use Sqoop
to load the output files into the database?
Is there any other way to do this?
Thanks
Upvotes: 1
Views: 1640
Reputation: 83
I used a DataFrame and saved the data to SQL Server:
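import java.util.Properties;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.SaveMode;

// Build a DataFrame from the RDD, using the bean class to infer the schema
SQLContext sqlcontext = new SQLContext(context);
DataFrame outDataFrame = sqlcontext.createDataFrame(finalOutPutRDD, WebHttpOutPutVO.class);
// Connection properties passed to the JDBC driver
Properties prop = new Properties();
prop.setProperty("database", "Web_Session");
prop.setProperty("user", "user");
prop.setProperty("password", "pwd@123");
prop.setProperty("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver");
// Append the rows to the existing table
outDataFrame.write().mode(SaveMode.Append).jdbc("jdbc:sqlserver://<Host>:1433", "test_table", prop);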
Upvotes: 2
Reputation: 14598
There are two approaches you can use for writing your results back to the database:
1. Use something like Hadoop's DBOutputFormat and configure it for your database and table.
2. Use foreachPartition on the RDD you want to save and pass in a function that opens a connection to the database (MySQL, for example) and writes that partition's results back.
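A rough sketch of what the foreachPartition function could do, assuming plain JDBC and rows represented as String[]. The class name PartitionWriter, the INSERT_SQL statement, and the session_id/url columns are made up for illustration and are not from the question; adjust them to your table.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Iterator;

final class PartitionWriter {
    // Hypothetical table and columns, used only for illustration.
    static final String INSERT_SQL =
            "INSERT INTO test_table (session_id, url) VALUES (?, ?)";

    /**
     * Writes every row of one partition over a single, already-open connection,
     * sending the inserts to the driver in batches of at most batchSize rows.
     * Returns the number of rows handed to the driver.
     */
    static int writePartition(Connection conn, Iterator<String[]> rows, int batchSize)
            throws SQLException {
        int written = 0;
        int pending = 0;
        try (PreparedStatement stmt = conn.prepareStatement(INSERT_SQL)) {
            while (rows.hasNext()) {
                String[] row = rows.next();
                stmt.setString(1, row[0]);
                stmt.setString(2, row[1]);
                stmt.addBatch();
                pending++;
                written++;
                if (pending == batchSize) {
                    stmt.executeBatch(); // flush a full batch
                    pending = 0;
                }
            }
            if (pending > 0) {
                stmt.executeBatch(); // flush the remainder
            }
        }
        return written;
    }
}
```

With a JavaRDD<String[]> named rdd, the call would look roughly like rdd.foreachPartition(rows -> { try (Connection conn = DriverManager.getConnection(url, user, password)) { PartitionWriter.writePartition(conn, rows, 500); } }); where url, user, password and the batch size of 500 are placeholders. The connection is opened inside the function because it runs on the executors and a JDBC connection cannot be serialized and shipped from the driver. Opening it once per partition rather than once per record keeps the connection overhead low.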
Upvotes: 0