Brandon

Reputation: 3136

Understanding Datastore Get RPCs in Google App Engine

I'm using sharded counters (https://cloud.google.com/appengine/articles/sharding_counters) in my GAE application for performance reasons, but I'm having some trouble understanding why it's so slow and how I can speed things up.

Background
I have an API that grabs a set of 20 objects at a time and for each object, it gets a total from a counter to include in the response.

Metrics
With Appstats turned on and a clear cache, I notice that getting the totals for 20 counters makes 120 datastore_v3.Get RPCs, which take 2500ms.

Thoughts
This seems like quite a lot of RPC calls and quite a bit of time for reading just 20 counters. I assumed this would be faster; maybe that's where I'm wrong. Is it supposed to be faster than this?

Further Inspection
I dug into the stats a bit more, looking at these two lines in the get_count method:

all_keys = GeneralCounterShardConfig.all_keys(name)
for counter in ndb.get_multi(all_keys):
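For reference, here is the full get_count from the article (as best I can tell from my copy of it):

def get_count(name):
    """Retrieve the value for a given sharded counter."""
    total = memcache.get(name)
    if total is None:
        total = 0
        all_keys = GeneralCounterShardConfig.all_keys(name)
        for counter in ndb.get_multi(all_keys):
            if counter is not None:
                total += counter.count
        memcache.add(name, total, 60)
    return total

Note that all_keys itself calls get_or_insert on the config entity, so it issues a Get of its own for each counter.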

If I comment out the get_multi line, I see that there are 20 RPC calls by datastore_v3.Get totaling 185ms.

As expected, this leaves get_multi as the culprit for the remaining 100 datastore_v3.Get RPCs, which take upwards of 2500ms. I verified this, but this is where I'm confused: why does calling get_multi with 20 keys cause 100 RPC calls?

Update #1
I checked out Traces in the GAE console and saw some additional information. It shows a breakdown of the RPC calls there as well, but in the insights it says to "Batch the gets to reduce the number of remote procedure calls." I'm not sure how to do that outside of using get_multi; I thought that did the job. Any advice here?

Update #2
Here are some screenshots showing the stats I'm looking at. The first one is my baseline: the function without any counter operations. The second one is after a call to get_count for just one counter. This shows a difference of 6 datastore_v3.Get RPCs.

Baseline [screenshot]

After Calling get_count On One Counter [screenshot]

Update #3
Based on Patrick's request, I'm adding a screenshot of info from the console Trace tool. [screenshot]

Upvotes: 2

Views: 395

Answers (2)

Evan Jones

Reputation: 406

When I've dug into similar issues, one thing I've learned is that get_multi can cause multiple RPCs to be sent from your application. It looks like the default in the SDK is set to 1000 keys per get, but the batch size I've observed in production apps is much smaller: something more like 10 (going from memory).

I suspect the reason it does this is that at some batch size it actually is better to use multiple RPCs: there is more RPC overhead for your app, but also more Datastore parallelism. In other words, this is still probably the best way to read a lot of datastore objects.

However, if you don't need to read the absolute most current value, you can try setting the db.EVENTUAL_CONSISTENCY option, but that seems to only be available in the older db library and not in ndb. (Although it also appears to be available via the Cloud Datastore API).
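For example, with the db library it would look something like this (just a sketch, untested; it assumes a db.Model twin of the shard kind and key names built the way the article builds them):

from google.appengine.ext import db

class GeneralCounterShard(db.Model):
    # db twin of the ndb shard model, so db.get can return these entities
    count = db.IntegerProperty(default=0)

# Key names must match how the shards are actually keyed
# (the article builds them from SHARD_KEY_TEMPLATE).
keys = [db.Key.from_path('GeneralCounterShard',
                         SHARD_KEY_TEMPLATE.format(name, i))
        for i in range(num_shards)]
shards = db.get(keys, read_policy=db.EVENTUAL_CONSISTENCY, deadline=5)
total = sum(s.count for s in shards if s is not None)

Note how this interacts with the SDK code below: max_egs_per_rpc is only applied when is_read_current is true, so an eventually consistent read like this shouldn't be split into small per-entity-group batches.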

Details

If you look at the Python code in the App Engine SDK, specifically the file google/appengine/datastore/datastore_rpc.py, you will see the following lines:

max_count = (Configuration.max_get_keys(config, self.__config) or
             self.MAX_GET_KEYS)
...

if is_read_current and txn is None:
  max_egs_per_rpc = self.__get_max_entity_groups_per_rpc(config)
else:
  max_egs_per_rpc = None

...

pbsgen = self._generate_pb_lists(indexed_keys_by_entity_group,
                                 base_req.ByteSize(), max_count,
                                 max_egs_per_rpc, config)

rpcs = []
for pbs, indexes in pbsgen:
  rpcs.append(make_get_call(base_req, pbs,
                            self.__create_result_index_pairs(indexes)))

My understanding of this:

  • Set max_count from the configuration object, or 1000 as a default
  • If the request must read the current value, set max_egs_per_rpc from the configuration, or 10 as a default
  • Split the input keys into individual RPCs, using both max_count and max_egs_per_rpc as limits.

So, this is being done by the Python Datastore library.
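As a rough model of that splitting (my own back-of-the-envelope sketch, not SDK code):

import math

def rpcs_for_get_multi(num_keys, max_egs_per_rpc=10, max_count=1000):
    # Models a strongly consistent get_multi where every key lives in its
    # own entity group, as sharded-counter shards do. The defaults mirror
    # the SDK values discussed above.
    per_rpc = min(max_count, max_egs_per_rpc)
    return int(math.ceil(num_keys / float(per_rpc)))

print(rpcs_for_get_multi(20))  # 20 shard keys -> 2 RPCs
print(rpcs_for_get_multi(50))  # 50 shard keys -> 5 RPCs

The exact split in your app depends on the shard count per counter and on the max_egs_per_rpc value actually in effect at runtime.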

Upvotes: 0

Bardia D.

Reputation: 693

Try splitting up the for loop that goes through each item and the actual get_multi call itself. So something like:

all_values = ndb.get_multi(all_keys)
for counter in all_values:
    # Insert amazeballs codes here

I have a feeling it's one of these:

  1. The generator pattern (yielding inside the for loop) is causing something funky with get_multi execution paths
  2. Perhaps the number of items you are expecting doesn't match actual result counts, which could reveal a problem with GeneralCounterShardConfig.all_keys(name)
  3. The number of shards is set too high. I've realized that anything over 10 shards causes performance issues.
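One more thing worth trying if the per-counter reads are serializing (just a sketch; counter_names is a stand-in for whatever your API iterates over): fire off all the reads at once with ndb's async API so the get_multi calls overlap instead of running back to back.

futures = {}
for name in counter_names:
    keys = GeneralCounterShardConfig.all_keys(name)  # still a sync get_or_insert per counter
    futures[name] = ndb.get_multi_async(keys)

totals = {}
for name, futs in futures.items():
    shards = [f.get_result() for f in futs]
    totals[name] = sum(s.count for s in shards if s is not None)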

Upvotes: 1
