QPeiran

Reputation: 1495

Bulk insert to Cloud SQL from Apps Script using JDBC executeBatch() takes too much time

I have some raw data collected and stored in a Google Sheets spreadsheet, and there is also an existing Google Cloud SQL instance. I am trying to use Apps Script to pull the data from the sheet and push it to Cloud SQL.

Unfortunately, I found it took too much time to finish the "bulk insert". Here is my method and results (in this example, I didn't show the way to pull data from gsheet because it is fast and irrelevant):

Apps Script:

    var connection = [My Connection];
    connection.setAutoCommit(false);
    var stmt = connection.prepareStatement('INSERT INTO [testTable]'
        + ' (emp_no, title, from_date, to_date) VALUES (?, ?, ?, ?)');
    for (var i = 1; i <= 50; i++) { // tuples counter i
      stmt.setString(1, 1);
      stmt.setString(2, "Worker" + i);
      stmt.setString(3, "2018-03-11");
      stmt.setString(4, "2019-05-04");
      stmt.addBatch();
    }
    stmt.executeBatch();
    connection.commit();

The code is simple, and here is my result (from the "Execution Transcript"):

When the tuples counter i is less than or equal to 50:

[19-08-12 13:57:46:470 NZST] JdbcPreparedStatement.executeBatch() [9.978 seconds]

When the tuples counter i is less than or equal to 500:

[19-08-12 14:10:23:575 NZST] JdbcPreparedStatement.executeBatch() [96.578 seconds]

What I want to do is to pull and push 5000 tuples. How can I reduce the execution time in this scenario?

Upvotes: 3

Views: 760

Answers (2)

Arthur Noort

Reputation: 175

I had the same issue and decided to use a script property to track progress, insert the rows in batches of 250, and run the script multiple times a day. It's not pretty, but it saves the hassle of setting up alternative environments.

Something like this:

    // Read the stored counter to find which row number to start with.
    // lastRow is assumed to be defined earlier in the script.
    const scriptProperties = PropertiesService.getScriptProperties();
    const countStart = parseInt(scriptProperties.getProperty('Count'), 10);
    const countEnd = Math.min(countStart + 250, lastRow);

    // If all rows are already processed, don't do anything.
    if (countStart !== lastRow) {
      const conn = Jdbc.getCloudSqlConnection('xxxx', 'xxxx', 'xxxx');
      conn.setAutoCommit(false);
      // const stmt = ... (statement here)

      // Loop through the data from countStart to countEnd to create batches here.
    }
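
A fuller version of the idea above might look like the following. It is only a minimal sketch under several assumptions: `nextBatchRange` and `insertNextBatch` are hypothetical helper names, `rows` is a 0-based array of sheet values, the counter is stored under the `Count` property as in the snippet above, and the table and column names are borrowed from the question. The properties store and the connection are passed in as arguments so the logic does not depend on where they come from.

```javascript
// Hypothetical helper: work out which slice of rows the next run should handle.
// A missing or unparsable counter is treated as 0 (an assumption, not Apps Script behavior).
function nextBatchRange(countStart, lastRow, batchSize) {
  const start = Number.isFinite(countStart) ? countStart : 0;
  const end = Math.min(start + batchSize, lastRow);
  return { start: start, end: end, done: start >= lastRow };
}

// Hypothetical helper: insert one batch and advance the stored counter.
// props: object with getProperty/setProperty (e.g. script properties).
// conn:  JDBC-style connection (e.g. from Jdbc.getCloudSqlConnection).
function insertNextBatch(rows, props, conn, batchSize) {
  const stored = parseInt(props.getProperty('Count'), 10);
  const range = nextBatchRange(stored, rows.length, batchSize);
  if (range.done) return 0; // everything was already processed

  conn.setAutoCommit(false);
  // Table and column names are taken from the question; adjust to your schema.
  const stmt = conn.prepareStatement(
    'INSERT INTO testTable (emp_no, title, from_date, to_date) VALUES (?, ?, ?, ?)');
  for (let r = range.start; r < range.end; r++) {
    for (let c = 0; c < 4; c++) {
      stmt.setString(c + 1, String(rows[r][c]));
    }
    stmt.addBatch();
  }
  stmt.executeBatch();
  conn.commit();
  stmt.close();

  // Only move the counter forward after a successful commit.
  props.setProperty('Count', String(range.end));
  return range.end - range.start;
}
```

In Apps Script this would presumably be called from a time-driven trigger, with `PropertiesService.getScriptProperties()` as `props` and the Cloud SQL connection as `conn`. Each run then picks up where the previous one stopped, which keeps any single execution short even though the JDBC calls themselves are no faster.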

      

Upvotes: 0

TheAddonDepot

Reputation: 8964

Google Apps Script's JDBC connector is notoriously slow.

You may have to forgo using it altogether and leverage something else instead.

If you know your way around Node.js, you might want to consider using a Cloud Function as an intermediary service that moves data between your sheets and your Cloud SQL database.
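
As a rough illustration of what such an intermediary could do, here is a minimal Node.js sketch. Everything in it is an assumption rather than a tested deployment: `buildBulkInsert` and `makeInsertHandler` are hypothetical names, `query(sql, params)` stands for whatever promise-returning function your MySQL client exposes, and the request is assumed to carry a JSON body of the form `{ rows: [[...], ...] }`. The sketch also swaps in a different technique from the JDBC batch in the question: it sends a whole batch as one multi-row `INSERT` statement.

```javascript
// Columns taken from the question; adjust to your schema.
const COLUMNS = ['emp_no', 'title', 'from_date', 'to_date'];

// Hypothetical helper: build one multi-row INSERT with "?" placeholders.
// table and columns are interpolated into the SQL text, so they must be
// trusted constants, never user input.
function buildBulkInsert(table, columns, rows) {
  if (rows.length === 0) throw new Error('rows must not be empty');
  const rowPlaceholder = '(' + columns.map(() => '?').join(', ') + ')';
  const sql = 'INSERT INTO ' + table + ' (' + columns.join(', ') + ') VALUES ' +
    rows.map(() => rowPlaceholder).join(', ');
  const params = [];
  rows.forEach((row) => {
    if (row.length !== columns.length) throw new Error('row width mismatch');
    row.forEach((value) => params.push(value));
  });
  return { sql: sql, params: params };
}

// Hypothetical factory for an Express-style HTTP handler.
// query(sql, params) is injected and assumed to return a Promise.
function makeInsertHandler(query) {
  return async function handler(req, res) {
    try {
      const rows = req.body.rows;
      const built = buildBulkInsert('testTable', COLUMNS, rows);
      await query(built.sql, built.params);
      res.status(200).send({ inserted: rows.length });
    } catch (err) {
      res.status(500).send({ error: String(err) });
    }
  };
}
```

On the Apps Script side, the sheet values would then be posted to the function's URL as JSON (for example with `UrlFetchApp.fetch`), so the script makes one HTTP call per batch instead of driving the inserts through JDBC.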

Upvotes: 5
