Reputation: 4383
We have a GAE app where the admin needs to upload a CSV file, parse it, and store the data in the datastore. The CSV has 48 columns and on average 10,000 rows. We are currently on the free quota; given the GAE pricing for Datastore writes, I've calculated that one upload results in:
(2 + 48) * 10,000 = 50,000 writes (none of the columns are indexed)
So we hit the datastore write quota quite fast. Is there any workaround for this?
FYI, the values must be persisted because the data has to be searchable (exam results). We plan to search by the ID column, which means at least one index.
Upvotes: 0
Views: 81
Reputation: 6201
Consider putting this data into a few shards. If your IDs are incremental, put the first 2k rows into an entity with ID "1", the next 2k into entity "2", and so on. Then, when you want to retrieve the row with ID 2345, you know you have to read shard "2" and look up the exact row in memory. This makes it extremely cheap (only a couple of writes and reads) and keeps your shards small enough to stay fast. Of course, this will not work if your rows are large.
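A minimal sketch of this sharding idea, assuming the Python ndb client and incremental integer row IDs; the `ResultShard` model, `SHARD_SIZE`, and the row layout are illustrative, not anything from the question:

```python
from google.appengine.ext import ndb

SHARD_SIZE = 2000  # rows per shard; tune to keep each entity well under 1 MB

class ResultShard(ndb.Model):
    # All rows of one shard live in a single unindexed blob, so writing
    # a shard costs only the 2 writes charged for the entity itself.
    rows = ndb.JsonProperty(compressed=True)

def shard_key(row_id):
    # Incremental IDs map directly to a shard number.
    return ndb.Key(ResultShard, str(row_id // SHARD_SIZE))

def store_rows(parsed_rows):
    # parsed_rows: list of dicts parsed from the CSV, each with an 'id' key.
    shards = {}
    for row in parsed_rows:
        shards.setdefault(row['id'] // SHARD_SIZE, []).append(row)
    entities = [ResultShard(key=ndb.Key(ResultShard, str(n)), rows=chunk)
                for n, chunk in shards.items()]
    ndb.put_multi(entities)  # ~5 entity writes for 10,000 rows

def get_row(row_id):
    shard = shard_key(row_id).get()  # one datastore read
    if shard is None:
        return None
    return next((r for r in shard.rows if r['id'] == row_id), None)
```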
Upvotes: 0
Reputation: 41099
It should be 500,000 in your formula, but fortunately for you there are no datastore writes for unindexed properties, as Mikhail pointed out.
On the other hand, if the data has to be searchable, you will have to index at least some properties, which will increase your write costs. With all properties unindexed, there is no difference between storing this data in the datastore and keeping it in a text file; in fact, reading it from a file is cheaper.
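A sketch of the middle ground, assuming the Python ndb client: index only the ID column you query on and leave the other 47 columns unindexed. The `ExamResult` model and property names here are illustrative; each put then costs the 2 entity writes plus the writes for a single indexed property value, far below the all-columns-indexed figure:

```python
from google.appengine.ext import ndb

class ExamResult(ndb.Model):
    # The only indexed property: each put costs 2 writes for the entity
    # plus the writes for this one indexed value.
    student_id = ndb.StringProperty(indexed=True)
    # The remaining 47 columns stay unindexed and add no write cost.
    data = ndb.JsonProperty(compressed=True)

def find_result(student_id):
    # Searchable by ID thanks to the one index kept above.
    return ExamResult.query(ExamResult.student_id == student_id).get()
```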
Upvotes: 1