Reputation: 10539
Uploading to GCS from a pod inside GKE takes a really long time. I hoped the upgrade to Kubernetes 1.1 would help, but it didn't. It is faster, but not as fast as it should be. I ran some benchmarks, uploading a single 100 MiB file:
docker 1.7.2 local: took 20m51s240ms, about 0.0799 MB/s
docker 1.8.3 local: took 3m51s193ms, about 0.433 MB/s
docker 1.9.0 local: took 3m51s424ms, about 0.433 MB/s
kubernetes 1.0: took 1h10s952ms, about 0.0277 MB/s
kubernetes 1.1.2 (docker 1.8.3): took 32m11s359ms, about 0.0518 MB/s
As you can see, the throughput doubles with Kubernetes 1.1.2 but is still really slow. If I want to upload 1 GB I have to wait about 5 hours; this can't be the expected behaviour. GKE runs inside the Google infrastructure, so I expected it to be faster than, or at least as fast as, uploading from my local machine.
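For reference, the arithmetic behind the "about 5 hours" estimate can be reproduced with a few lines of Java. This is only a sketch: the class and method names (UploadEstimate, throughputMBps, estimate) are made up for illustration, and it assumes 1 GB = 1000 MB and a constant transfer rate.

```java
import java.time.Duration;

final class UploadEstimate {
    // Throughput in MB/s for `megabytes` transferred in `elapsed`.
    static double throughputMBps(double megabytes, Duration elapsed) {
        return megabytes / (elapsed.toMillis() / 1000.0);
    }

    // Estimated time to transfer `megabytes` at a constant rate of `mbPerSecond`.
    static Duration estimate(double megabytes, double mbPerSecond) {
        return Duration.ofMillis(Math.round(megabytes / mbPerSecond * 1000.0));
    }

    public static void main(String[] args) {
        // Kubernetes 1.1.2 measurement from the benchmark: 100 MB in 32m11s359ms.
        Duration measured = Duration.ofMinutes(32).plusSeconds(11).plusMillis(359);
        double rate = throughputMBps(100, measured);

        // Assumption: 1 GB is treated as 1000 MB.
        Duration oneGb = estimate(1000, rate);
        System.out.println("rate in MB/s: " + rate);
        System.out.println("1 GB takes about " + oneGb.toHours() + " h "
                + (oneGb.toMinutes() % 60) + " min");
    }
}
```

At the measured Kubernetes 1.1.2 rate this comes out to a bit over five hours for 1 GB.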
I also noticed a very high CPU load (70%) while uploading. This was tested with an n1-highmem-4 machine type and a single RC/pod that was doing nothing other than the upload.
I'm using the Java client with the GAV coordinates com.google.appengine.tools:appengine-gcs-client:0.5
The relevant code is as follows:
InputStream inputStream = ...; // 100MB RandomData from RAM
StorageObject so = new StorageObject().setContentType("text/plain").setName(objectName);
AbstractInputStreamContent content = new InputStreamContent("text/plain", inputStream);
Stopwatch watch = Stopwatch.createStarted();
storage.objects().insert(bucket.getName(), so, content).execute();
watch.stop();
Copying a 100 MB file using a manually installed gcloud with gsutil cp took almost no time (3 seconds). So it might be an issue with the Java library? The question remains: how can I improve the upload time using the Java library?
Upvotes: 1
Views: 2279
Reputation: 10539
The solution is to enable "DirectUpload", so instead of writing
storage.objects().insert(bucket.getName(), so, content).execute();
you have to write:
Storage.Objects.Insert insert = storage.objects().insert(bucket.getName(), so, content);
insert.getMediaHttpUploader().setDirectUploadEnabled(true);
insert.execute();
Performance I get with this solution:
JavaDoc for setDirectUploadEnabled:
Sets whether direct media upload is enabled or disabled.
If value is set to true then a direct upload will be done where the whole media content is uploaded in a single request. If value is set to false then the upload uses the resumable media upload protocol to upload in data chunks.
Direct upload is recommended if the content size falls below a certain minimum limit. This is because there's minimum block write size for some Google APIs, so if the resumable request fails in the space of that first block, the client will have to restart from the beginning anyway.
Defaults to false.
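To make the difference between the two modes concrete, here is a rough sketch that counts HTTP requests per upload. It is plain Java and does not call the Google client library. The class and method names are hypothetical, and the 10 MiB chunk size is an assumption based on what I believe the uploader's default to be; check the value in your library version. It also assumes one session-initiation request plus one request per chunk for the resumable protocol, with no retries.

```java
final class UploadRequests {
    static final long KIB = 1024L;
    static final long MIB = 1024L * KIB;

    // Resumable upload: one request to start the session, then one per chunk.
    // Assumption: no retries, and empty content still needs one chunk request.
    static long resumableRequests(long contentBytes, long chunkBytes) {
        if (contentBytes < 0 || chunkBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        long chunks = Math.max(1, (contentBytes + chunkBytes - 1) / chunkBytes);
        return 1 + chunks;
    }

    // Direct upload: the whole media content goes out in a single request.
    static long directRequests() {
        return 1;
    }

    public static void main(String[] args) {
        long size = 100 * MIB; // the 100 MiB test file from the question
        System.out.println("resumable, 10 MiB chunks (assumed default): "
                + resumableRequests(size, 10 * MIB));
        System.out.println("resumable, 256 KiB chunks: "
                + resumableRequests(size, 256 * KIB));
        System.out.println("direct: " + directRequests());
    }
}
```

The request count alone probably does not explain a slowdown of this size; the sketch only illustrates what "single request" versus "data chunks" means for a 100 MiB file.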
Upvotes: 1
Reputation: 10677
The fact that you're seeing high CPU load and that the slowness only affects Java and not the Python gsutil is consistent with the slow AES GCM issue in Java 8. The issue is fixed in Java 9 using appropriate specialized CPU instructions.
If you have control over it, either using Java 7 or adding jdk.tls.disabledAlgorithms=SSLv3,GCM to a file passed to java -Djava.security.properties should fix the slowness, as explained in this answer to the general slow AES GCM question.
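If passing a properties file on the command line is not convenient, the same security property can also be set programmatically. This is a sketch of an alternative, not what the linked answer describes: it assumes the call runs before the JVM opens its first TLS connection, since the property may already have been read by then, and the class name DisableGcm is made up. Note that it replaces whatever value the JDK ships for this property.

```java
import java.security.Security;

final class DisableGcm {
    // Same value as in the properties-file approach: disables SSLv3 and GCM cipher suites.
    // Assumption: must be called before any TLS connection is created in this JVM.
    static void disableGcmCipherSuites() {
        Security.setProperty("jdk.tls.disabledAlgorithms", "SSLv3,GCM");
    }

    public static void main(String[] args) {
        disableGcmCipherSuites();
        // Print the effective value to confirm the override took place.
        System.out.println(Security.getProperty("jdk.tls.disabledAlgorithms"));
    }
}
```

Calling disableGcmCipherSuites() as the first statement of the application's main method, before the Storage client is built, would be the natural place for it.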
Upvotes: 0