Reputation: 1808
In our GAE app, users often need to download entities of a particular kind as a CSV file. New entities are frequently added or updated, which makes it infeasible to write all entities to a blob in advance or at fixed intervals.
For around 50,000 entities (each < 2 KB), fetched in batches of 500, it takes over 2 minutes to write the CSV blob and costs nearly $1. Users also have to wait a long time to receive a file that is usually only a few (< 5) MB.
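For context, the export loop looks roughly like the sketch below. This is a minimal reconstruction assuming the Python runtime, ndb, and the GCS client library; `Record`, its properties, and the bucket path are placeholders, not the actual model:

```python
import csv

import cloudstorage as gcs
from google.appengine.ext import ndb


class Record(ndb.Model):
    # Hypothetical kind standing in for the real entity type.
    name = ndb.StringProperty()
    value = ndb.IntegerProperty()


BATCH_SIZE = 500


def export_csv(filename='/my-bucket/export.csv'):
    # Stream the CSV into a Google Cloud Storage object.
    with gcs.open(filename, 'w', content_type='text/csv') as f:
        writer = csv.writer(f)
        writer.writerow(['name', 'value'])
        cursor, more = None, True
        while more:
            # Each page is one round trip; 50,000 entities in pages of
            # 500 means ~100 sequential RPCs, which is where most of
            # the 2-minute wall time goes.
            records, cursor, more = Record.query().fetch_page(
                BATCH_SIZE, start_cursor=cursor)
            for r in records:
                writer.writerow([r.name, r.value])
```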
I have 2 questions:
A) Can the time to write the blob be reduced by configuring a mapreduce pipeline for the export?
B) Is there a way to reduce the cost of fetching a large number of entities from the datastore and writing them to blobs?
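One lever relevant to (B), assuming only a few properties actually end up in the CSV: a projection query returns just the listed fields, and projection results are billed as small datastore operations rather than full entity reads, which is considerably cheaper. A sketch, reusing the hypothetical `Record` model from above:

```python
def fetch_projected_page(cursor=None):
    # Projection queries materialize only the listed properties; each
    # result counts as a small operation instead of a full read.
    # Note: projected properties must be indexed.
    query = Record.query(projection=[Record.name, Record.value])
    return query.fetch_page(500, start_cursor=cursor)
```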
Edit: I just learned that mapreduce can only run over all entities of a kind, not a filtered subset, so it would probably increase the cost substantially. Any other suggestions?
Upvotes: 2
Views: 337