Reputation: 381

What is the most efficient way to use datastore in google appengine?

I am currently working on a project that is supposed to be used by a lot of people. I am worried that datastore read/write/small ops are using too many resources. Since I am new to App Engine, are there any efficient ways to make those numbers smaller? I thought about using memcache, but it's not the most secure way. Also, is it a good idea to search for datastore entities using:

SELECT __key__ FROM table

then use:

....#code
table.get_by_id(entity.id())
....#code

Thank you very much.

Upvotes: 0

Answers (4)

Nick Johnson

Reputation: 101149

No, there's no reason to do a keys only query, then fetch the entities separately, unless you only want to retrieve some of the entities identified by the returned keys. If this were more efficient, the datastore would do it for you. Just do a regular query.

Upvotes: 1

Rob Boyle

Reputation: 430

Since it looks like you are using Python, I would highly recommend using the new datastore API, NDB.

NDB automatically uses memcache to cache its models behind the scenes without any extra work on your part. Granted, you should also look at using memcache manually, since NDB isn't a silver bullet. But it'll help you for free, which is always nice.

Beyond the performance gains, it's a cleaner interface to the App Engine datastore. It also has clean support for bulk operations, which can also be a performance boost.

Upvotes: 8

Dan Sanderson

Reputation: 2111

For data that is read frequently and written not so frequently, use the memcache in front of the datastore. When you read, first check whether the data is in memcache, and if not, read it from the datastore then store it in memcache for future reads. In the simple case where you're reading entities by key, you can just store each entity by its datastore key in the memcache. For queries, you'll have to decide whether it's worth storing the result set in memcache, keyed by the query parameters.

When you write, you can delete the memcache value and it'll be reloaded on the next read. You have to live with the possibility that the delete will fail. Typically, you set an expiration time on the memcache value so an old value doesn't stick around very long. Notice that for many reads per second, even a short expiration time gets you a significant performance gain.
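The read and write paths described above can be sketched roughly as follows. This is only an illustration, not App Engine code: `FakeCache` and `FakeDatastore` are hypothetical in-process stand-ins for memcache and the datastore, and `read_entity`, `write_entity` and `CACHE_TTL_SECONDS` are names invented for the example. In a real app you would swap in the actual service calls.

```python
import time


class FakeCache:
    """Hypothetical in-process stand-in for a memcache-like service."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        item = self._items.get(key)
        if item is None:
            return None
        value, expires_at = item
        # Treat expired values as missing, like a cache eviction.
        if expires_at is not None and time.time() >= expires_at:
            del self._items[key]
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = time.time() + ttl if ttl else None
        self._items[key] = (value, expires_at)

    def delete(self, key):
        self._items.pop(key, None)


class FakeDatastore:
    """Hypothetical stand-in for the datastore; counts reads."""

    def __init__(self):
        self._rows = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._rows.get(key)

    def put(self, key, entity):
        self._rows[key] = entity


cache = FakeCache()
store = FakeDatastore()
CACHE_TTL_SECONDS = 60  # assumed value; keeps stale data from lingering


def read_entity(key):
    """Check the cache first, fall back to the datastore on a miss."""
    entity = cache.get(key)
    if entity is None:
        entity = store.get(key)
        if entity is not None:
            cache.set(key, entity, ttl=CACHE_TTL_SECONDS)
    return entity


def write_entity(key, entity):
    """Write to the datastore, then drop the cached copy."""
    store.put(key, entity)
    cache.delete(key)
```

With this shape, repeated calls to `read_entity` for the same key touch the datastore once until the value expires or is written again. The expiration time is what limits the damage if a delete ever fails.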

You can also use memcache in the same way with other time-consuming data operations, like URL Fetch, or calculated values (e.g. complex templated text). In all of these cases, if the memcache value has been evicted, you fall back to the primary source, so you gain read performance without losing access to the data.

Other performance tips: Use batch calls when possible to reduce the number of RPCs. Use asynchronous calls when possible so your app is not blocking on service calls when it could be doing something else. Use AppStats to visualize your service calls and find areas where asynchronous calls might help.
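To make the batch-call tip concrete, here is a small sketch of why batching reduces RPCs. `CountingStore` is a hypothetical stand-in that counts one RPC per call, so it does not measure real latency. In NDB the corresponding batch read is `ndb.get_multi(keys)`.

```python
class CountingStore:
    """Hypothetical datastore stand-in that counts round trips (RPCs)."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.rpc_count = 0

    def get(self, key):
        # One RPC per entity.
        self.rpc_count += 1
        return self._rows.get(key)

    def get_multi(self, keys):
        # One RPC for the whole batch.
        self.rpc_count += 1
        return [self._rows.get(key) for key in keys]


def fetch_one_by_one(store, keys):
    """Issues len(keys) separate calls."""
    return [store.get(key) for key in keys]


def fetch_batched(store, keys):
    """Issues a single call for all keys."""
    return store.get_multi(keys)
```

Both functions return the same entities in the same order. The only difference is the number of round trips, which is usually what dominates request time.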

Regarding your fetch-by-key question: In general, doing a keys-only query then immediately fetching the result entities by key doesn't help, because that's all a full-entity query does anyway. But if you need to fetch selectively from the results of the query, or if it makes sense to query for keys in one place and fetch in another, those are possibilities, and you don't lose much. I often find uses for keys-only queries. See also projection queries, for fetching only a subset of (indexed) properties.
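One case where a keys-only query does pay off is selective fetching. The sketch below assumes you already hold some of the entities (for example, from a cache) and only want to load the rest. `keys_only_query` and `fetch_missing` are hypothetical helpers, and a plain dict stands in for the datastore.

```python
def keys_only_query(rows, kind):
    """Stand-in for a keys-only query: returns matching keys, no entity data."""
    return [key for key in rows if key.startswith(kind + ":")]


def fetch_missing(rows, keys, already_loaded):
    """Fetch only the entities whose keys are not already loaded."""
    wanted = [key for key in keys if key not in already_loaded]
    return {key: rows[key] for key in wanted}


def load_kind(rows, kind, already_loaded):
    """Combine the two steps: query keys, then fetch just what is missing."""
    keys = keys_only_query(rows, kind)
    result = dict(already_loaded)
    result.update(fetch_missing(rows, keys, already_loaded))
    return {key: result[key] for key in keys}
```

If you end up fetching every key the query returns, this is no better than a regular query, which is the point made above.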

Upvotes: 8

dkamins

Reputation: 21938

The most efficient way to use the Datastore in Google App Engine is... to not use the Datastore! It's slow.

Use memcache wherever possible. What is "insecure" about memcache? That's certainly an unusual critique.

Also, if you know the key or ID of your entity, just load it directly, e.g. with get_by_key_name.

Upvotes: 2
