Alan Ivey

Reputation: 947

Slow importing into Google Cloud SQL

Google Cloud SQL is my first real evaluation of MySQL as a service. I created a D32 instance, set replication to asynchronous, and disabled binary logging. Importing 5.5 GB of dump files from a GCE n1-standard-1 instance in the same zone took 97 minutes.

Following the documentation, I connected using the instance's public IP address, even though the client is in the same region and zone. I'm fully open to the possibility that I did something incorrectly. Is there anything immediately obvious that I should be doing differently?
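For reference, the import was a plain mysql client load over the public IP, something along these lines (host, user, and file names below are placeholders):

# load each dump file through the mysql client over the public IP
mysql --host=<public-ip> --user=root --password mydatabase < dump.sql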

Upvotes: 2

Views: 2196

Answers (1)

koma

Reputation: 6566

We have been importing ~30 GB via Cloud Storage from zip files containing SQL statements, and it takes over 24 hours. A big factor is the number of indexes on the target table.
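If the schema allows it, one way to mitigate the index overhead is to drop secondary indexes before the bulk load and recreate them afterwards. A rough sketch (table, index, and host names here are made up for illustration):

# drop a secondary index before the bulk load to speed up inserts
mysql -h <cloudsql-ip> -u root -p mydbname -e "ALTER TABLE mytable DROP INDEX idx_created_at;"
# ... run the chunked import here ...
# recreate the index once all chunks are loaded
mysql -h <cloudsql-ip> -u root -p mydbname -e "ALTER TABLE mytable ADD INDEX idx_created_at (created_at);"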

To keep it manageable, we split the file into chunks of 200K SQL statements each, which are inserted in one transaction. This lets us retry individual chunks if an error occurs.
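Assuming the dump has one statement per line, the chunking itself can be done with split, wrapping each chunk in a single transaction; the file names here are illustrative:

#!/bin/bash
# split a one-statement-per-line dump into chunks of 200K statements
split -l 200000 dump.sql chunk_
# wrap each chunk in one transaction so a failed chunk can be retried whole
for f in chunk_*; do
    { echo "START TRANSACTION;"; cat "$f"; echo "COMMIT;"; } > "$f.sql"
    rm "$f"
done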

We also tried doing it via Compute Engine (the mysql command line), and in our experience that was even slower.

Here is how to import one chunk and wait for it to complete. You cannot do this in parallel, as Cloud SQL only allows one import operation at a time.

#!/bin/bash
# Import one SQL chunk from Cloud Storage into Cloud SQL and poll until done.
# Usage: <script> <db-instance> <gcs-file> <poll-interval-seconds>

# Fetch a fresh OAuth access token from the GCE metadata server.
# Tokens expire, so this is called again on every poll iteration.
function refreshAccessToken() {
    echo "getting access token..."
    ACCESSTOKEN=$(curl -s "http://metadata/computeMetadata/v1/instance/service-accounts/default/token" \
        -H "Metadata-Flavor: Google" | jq -r ".access_token")
    echo "retrieved access token $ACCESSTOKEN"
}

START=$(date +%s%N)

DB_INSTANCE=$1    # Cloud SQL instance name
GCS_FILE=$2       # gs:// URI of the chunk to import
SLEEP_SECONDS=$3  # polling interval while waiting for the import

refreshAccessToken

# Kick off the import through the v1beta1 REST API.
CURL_URL="https://www.googleapis.com/sql/v1beta1/projects/myproject/instances/$DB_INSTANCE/import"
CURL_OPTIONS="-s --header 'Content-Type: application/json' --header 'Authorization: OAuth $ACCESSTOKEN' --header 'x-goog-project-id:myprojectId' --header 'x-goog-api-version:1'"
CURL_PAYLOAD="--data '{ \"importContext\": { \"database\": \"mydbname\", \"kind\": \"sql#importContext\", \"uri\": [ \"$GCS_FILE\" ]}}'"
CURL_COMMAND="curl --request POST $CURL_URL $CURL_OPTIONS $CURL_PAYLOAD"

echo "executing $CURL_COMMAND"

CURL_RESPONSE=$(eval $CURL_COMMAND)
echo "$CURL_RESPONSE"
# The response contains the ID of the long-running import operation.
OPERATION=$(echo $CURL_RESPONSE | jq -r ".operation")

echo "Import operation $OPERATION started..."
CURL_URL="https://www.googleapis.com/sql/v1beta1/projects/myproject/instances/$DB_INSTANCE/operations/$OPERATION"
STATE="RUNNING"
# Poll the operation until it leaves the RUNNING state.
while [[ $STATE == "RUNNING" ]]
do
    echo "waiting for $SLEEP_SECONDS seconds for the import to finish..."
    sleep $SLEEP_SECONDS
    refreshAccessToken    # the token may have expired during the wait
    CURL_OPTIONS="-s --header 'Content-Type: application/json' --header 'Authorization: OAuth $ACCESSTOKEN' --header 'x-goog-project-id:myprojectId' --header 'x-goog-api-version:1'"
    CURL_COMMAND="curl --request GET $CURL_URL $CURL_OPTIONS"
    CURL_RESPONSE=$(eval $CURL_COMMAND)
    STATE=$(echo $CURL_RESPONSE | jq -r ".state")
    END=$(date +%s%N)
    # nanoseconds to seconds for a human-readable elapsed time
    ELAPSED=$(echo "scale=8; ($END - $START) / 1000000000" | bc)
    echo "Import process $OPERATION for $GCS_FILE : $STATE, elapsed time $ELAPSED"
done
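A driver loop can then feed the chunks to the script one at a time, since the API rejects a second concurrent import. The script name and bucket path below are made up for this example:

#!/bin/bash
# sequential driver: each chunk must finish before the next one starts
for CHUNK in $(gsutil ls gs://mybucket/chunks/chunk_*.sql); do
    ./import_chunk.sh my-instance "$CHUNK" 30 || { echo "chunk $CHUNK failed"; exit 1; }
done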

Upvotes: 2
