Yasser

Reputation: 1808

Download Google App Engine entities as CSV

In our GAE app, users often need to download entities of a particular kind as CSV. Entities are added and updated frequently, which makes it infeasible to write all entities to a blob in advance or at fixed intervals.

For around 50,000 entities (each under 2 KB), fetched in batches of 500, it takes over 2 minutes to write the CSV file to a blob, and it costs nearly $1. Users also have to wait a long time for a file that is usually under 5 MB.

I have 2 questions:

A) Can the time to write the blob possibly be reduced by configuring a map-reduce pipeline for the export?

B) Is there a way to reduce the cost of fetching a large number of entities from the datastore and writing them to blobs?

Edit: I just learned that mapreduce can only run on all entities of a kind, not on a filtered subset, so it would probably increase the cost considerably. Any other suggestions?

Upvotes: 2

Views: 337

Answers (1)

Shay Erlichmen

Reputation: 31928

  1. You should use the App Engine Pipeline API; it can speed up the export because it distributes the job across several instances.
  2. You can reduce the cost of fetching entities by using projection queries, in which you specify only the properties you want to fetch.
  3. As for the download speed, are you serving the file from the Blobstore with BlobstoreDownloadHandler?
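
To illustrate point 2, here is a minimal sketch of writing only the requested properties to CSV, batch by batch. It is plain Python 3 using only the standard library, so it does not call the datastore: `fetch_batches` and `export_csv` are hypothetical names, and `fetch_batches` is only a stand-in for a real projection query, returning dicts that hold just the projected properties. The batch size of 500 mirrors the question.

```python
import csv
import io


def fetch_batches(entities, properties, batch_size=500):
    """Hypothetical stand-in for a datastore projection query.

    Yields lists of dicts holding only the requested properties,
    batch_size entities at a time.
    """
    for start in range(0, len(entities), batch_size):
        batch = entities[start:start + batch_size]
        yield [{name: entity[name] for name in properties} for entity in batch]


def export_csv(entities, properties, batch_size=500):
    """Return CSV text with one column per requested property."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=properties, lineterminator="\n")
    writer.writeheader()
    # Write each batch as it arrives instead of collecting all rows first.
    for batch in fetch_batches(entities, properties, batch_size):
        writer.writerows(batch)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {"name": "a", "email": "a@example.com", "notes": "x" * 1000},
        {"name": "b", "email": "b@example.com", "notes": "y" * 1000},
    ]
    # The large "notes" property is never fetched or written.
    print(export_csv(sample, ["name", "email"], batch_size=1))
```

In a real app the stand-in would be replaced by an actual projection query, and the properties being projected would need to be indexed.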

Upvotes: 1
