Reputation: 51
Hi all. I recently needed to insert about 100,000,000 rows into MySQL using Spark SQL, but the speed is quite low: it takes about 1 hour. Does anyone know a more efficient way to do the insert?
MySQL configuration:
prop.setProperty("user", "user")
prop.setProperty("password", "password")
prop.setProperty("rewriteBatchedStatements", "true")
prop.setProperty("batchsize", "1000000")
prop.setProperty("numPartitions", "3000")
Upvotes: 5
Views: 5468
Reputation: 1217
Try adding ?rewriteBatchedStatements=true
to your MySQL URI. It would look something like this: jdbc:mysql://host:port/db?rewriteBatchedStatements=true
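For example (a minimal sketch; host, port, database, and table name are placeholders):

val url = "jdbc:mysql://host:3306/db?rewriteBatchedStatements=true"
df.write.mode("append").jdbc(url, "my_table", prop)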
Hope it helps.
Try this; it worked for me. Setting the driver property did the trick:
import java.util.Properties

val prop = new Properties()
prop.setProperty("user", dbUser)
prop.setProperty("password", dbPassword)
prop.setProperty("driver", "com.mysql.jdbc.Driver") // driver class for Connector/J 5.x
df.write.mode("append").jdbc("jdbc:mysql://" + dbHost + "/" + dbName + "?rewriteBatchedStatements=true", "TABLE_NAME", prop)
I also had to add the MySQL Connector/J dependency:
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.39</version>
</dependency>
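If you build with sbt instead of Maven, the equivalent dependency line (same version assumed) would be:

libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.39"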
Upvotes: 9