whathaha

Reputation: 51

How can Spark SQL batch insert data into MySQL?

All, recently I needed to insert about 100,000,000 rows into MySQL using Spark SQL, but the speed is quite low; it takes about 1 hour. Does anyone have a more efficient way to insert?

MySQL configuration:

prop.setProperty("user", "user")
prop.setProperty("password", "password")
prop.setProperty("rewriteBatchedStatements", "true")
prop.setProperty("batchsize", "1000000")
prop.setProperty("numPartitions", "3000") 

Upvotes: 5

Views: 5468

Answers (1)

Álvaro Valencia

Reputation: 1217

Try adding ?rewriteBatchedStatements=true to your MySQL URI. Your URI would look something like this: jdbc:mysql://host:port/db?rewriteBatchedStatements=true

Hope it helps.

Update

I tried it, and it worked for me; setting the driver property did the trick.

    import java.util.Properties

    val prop = new Properties()
    prop.setProperty("user", dbUser)
    prop.setProperty("password", dbPassword)
    // Explicitly set the JDBC driver class
    prop.put("driver", "com.mysql.jdbc.Driver")

    // Append with batched statement rewriting enabled via the URL
    df.write.mode("append").jdbc("jdbc:mysql://" + dbHost + "/" + dbName + "?rewriteBatchedStatements=true", "TABLE_NAME", prop)

I also had to add the MySQL Java connector as a dependency.

    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>5.1.39</version>
    </dependency>

Upvotes: 9
