DurhamG

Reputation: 222

How can I speed up the App Engine bulk downloader?

I'm trying to use the App Engine bulkloader to download entities from the datastore (the High Replication Datastore, if it matters). It works, but it's quite slow (about 85 KB/s). Is there some magical set of parameters I can pass to make it faster? I'm receiving about 5 MB/minute, or 20,000 records/minute. Given that my connection can do 1 MB/second (and hopefully App Engine can serve faster than that), there must be a way to speed it up.
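As a rough sanity check on those figures, here is a minimal sketch of the arithmetic. It assumes 1 MB = 1024 KB and 1 KB = 1024 bytes; the function and variable names are only illustrative and are not part of the bulkloader.

```python
# Rough throughput arithmetic for the figures quoted in the question.
# Assumption: 1 MB = 1024 KB and 1 KB = 1024 bytes.

def throughput(mb_per_minute, records_per_minute, link_mb_per_second):
    """Return (KB/s, average bytes per record, fraction of the link used)."""
    kb_per_second = mb_per_minute * 1024 / 60.0
    bytes_per_record = mb_per_minute * 1024 * 1024 / float(records_per_minute)
    link_fraction = (mb_per_minute / 60.0) / link_mb_per_second
    return kb_per_second, bytes_per_record, link_fraction

kb_s, bytes_rec, frac = throughput(5, 20000, 1.0)
print("%.1f KB/s, %.0f bytes/record, %.1f%% of the link" % (kb_s, bytes_rec, frac * 100))
# prints: 85.3 KB/s, 262 bytes/record, 8.3% of the link
```

So the download is using well under a tenth of the available bandwidth, and each record averages only a few hundred bytes, which suggests the limit is per-request overhead rather than raw transfer speed.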

Here's my current command. I've tried high numbers, low numbers, and every permutation:

appcfg.py download_data \
--application=xxx \
--url=http://xxx.appspot.com/_ah/remote_api \
--filename=backup.csv \
--rps_limit=30000 \
--bandwidth_limit=100000000 \
--batch_size=500 \
--http_limit=32 \
--num_threads=30 \
--config_file=bulkloader.yaml \
--kind=foo

I already tried the suggestions in "App Engine Bulk Loader Performance", and they're no faster than what I already have. The numbers he mentions are on par with what I'm seeing as well.

Thanks in advance.

Upvotes: 3

Views: 478

Answers (1)

Shay Erlichmen

Reputation: 31928

Did you set an index on the key of the entity you're trying to download?
I don't know whether it will help, but check whether you get a warning at the beginning of the download that says something about "using sequential download".

Put this in index.yaml to create an index on the entity key, then upload it and wait for the index to be built.

- kind: YOUR_ENTITY_TYPE
  properties:
  - name: __key__
    direction: desc
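If you need the same index for several kinds, a small script can generate the entries. This is only a sketch: the helper names `key_index_entry` and `index_yaml` are made up for illustration, and it assumes the entries sit under the top-level `indexes:` key that index.yaml uses.

```python
# Hypothetical helpers (not part of the App Engine SDK) that build
# index.yaml text with a descending __key__ index for each kind.

def key_index_entry(kind, direction="desc"):
    """Return one index.yaml list entry indexing __key__ for `kind`."""
    return (
        "- kind: %s\n"
        "  properties:\n"
        "  - name: __key__\n"
        "    direction: %s\n"
    ) % (kind, direction)

def index_yaml(kinds):
    """Return a complete index.yaml body covering every kind in `kinds`."""
    return "indexes:\n" + "\n".join(key_index_entry(k) for k in kinds)

# "foo" is the kind from the question; replace it with your own kinds.
print(index_yaml(["foo"]))
```

Review the generated text before uploading it, and merge it by hand if your index.yaml already defines other indexes.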

Upvotes: 3
