Reputation: 487
I am currently working on an Azure Databricks notebook that reads files from a storage container into a DataFrame and then writes all the records to a table in MySQL. The files can have anywhere from 2 million to 10 million rows. I have the following write code in my notebook after using read to populate my DataFrame.
newFile.repartition(16).write
  .format("jdbc").option("driver", "com.mysql.cj.jdbc.Driver")
  .mode("append")
  .option("url", s"${jdbc_url}")
  .option("dbtable", pathToDBTable)
  .option("user", s"user")
  .option("password", s"pass")
  .save()
I have played around with the partitions and decided to go with 16 because my cluster has 16 cores. Other than this, is there a faster way to insert all this data into my DB using write? Or are there any other suggested approaches to try within Azure Databricks notebooks? It currently takes 10-15 minutes for 2 million row files.
Upvotes: 1
Views: 508
Reputation: 87259
Most probably the delay is caused by the MySQL side - you're writing with 16 cores, and each one opens a separate connection to MySQL, so you can overload the database. Performance can also suffer if there are indexes on the columns being written, etc.
So it's recommended to check the MySQL side for reported problems, look at the load on the database node, check how many cores the instance has, etc.
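
If the database does turn out to be the bottleneck, it may also be worth tuning the JDBC write itself: fewer concurrent connections, a larger insert batch size, and MySQL Connector/J's rewriteBatchedStatements flag (which collapses a batch into multi-row INSERTs). A minimal sketch, reusing your newFile, jdbc_url, and pathToDBTable; the partition count of 8 and batch size of 10000 are only illustrative values to tune against your instance, and the URL parameter assumes jdbc_url has no query string yet:

// Fewer partitions means fewer concurrent MySQL connections.
// batchsize sets how many rows go into each JDBC insert batch (Spark's default is 1000).
// rewriteBatchedStatements=true lets the MySQL driver rewrite a batch as multi-row INSERTs.
newFile.repartition(8).write
  .format("jdbc")
  .option("driver", "com.mysql.cj.jdbc.Driver")
  .mode("append")
  .option("url", s"${jdbc_url}?rewriteBatchedStatements=true")
  .option("dbtable", pathToDBTable)
  .option("user", "user")
  .option("password", "pass")
  .option("batchsize", 10000)
  .save()

Measure the effect of each change separately - the right combination depends on how many connections and how large a write batch your MySQL instance can actually handle.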
Upvotes: 0