Reputation: 319
Why are my DynamoDB requests via boto's get_item so slow, and so often very slow? The AWS console reports that my get latency has peaked at 12.5 ms, yet none of my requests come anywhere near that low.
Python 2.7.5, AWS region us-west-1, boto 2.31.1, DynamoDB table size ~180k records
Code:
from boto.dynamodb2.fields import HashKey
from boto.dynamodb2.table import Table
from boto.dynamodb2.types import STRING
import boto.dynamodb2
import time
REGION = "us-west-1"
AWS_KEY = "xxxxx"
AWS_SECRET = "xxxxx"
start = time.time()
peeps = ("cefbdadf518f44da8a68e35b2321bb1f", "7e3a691df6134a4f83d381a5507cbb18")
connection = boto.dynamodb2.connect_to_region(REGION, aws_access_key_id=AWS_KEY, aws_secret_access_key=AWS_SECRET)
users = Table("users-test", schema=[HashKey("id", data_type=STRING)], connection=connection)
for peep in peeps:
    user = users.get_item(consistent=True, id=peep)
    print time.time() - start
Results:
(botot)➜ ~ python test2.py
0.056941986084
0.0681240558624
(botot)➜ ~ python test2.py
1.05709600449
1.06937909126
(botot)➜ ~ python test2.py
0.048614025116
0.0575139522552
(botot)➜ ~ python test2.py
0.0553398132324
0.064425945282
(botot)➜ ~ python test2.py
3.05251288414
3.06584000587
(botot)➜ ~ python test2.py
0.0579640865326
0.0699849128723
(botot)➜ ~ python test2.py
0.0530469417572
0.0628390312195
(botot)➜ ~ python test2.py
1.05059504509
1.05963993073
(botot)➜ ~ python test2.py
1.05139684677
1.0603158474
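The numbers above are cumulative from `start`, so connection setup and the first lookup are folded into the first printed value. A minimal sketch for timing each call on its own is below; `time_calls` is a hypothetical helper, not part of boto, and the usage comment assumes the `users` table and `peeps` tuple from the code above.

```python
import time


def time_calls(fn, args):
    """Call fn(arg) for each arg and return a list of (arg, seconds) pairs.

    Each call is timed separately, so a slow first call (for example one
    that also has to open the connection) stands out from the later ones.
    """
    timings = []
    for arg in args:
        t0 = time.time()
        fn(arg)
        timings.append((arg, time.time() - t0))
    return timings


# Assumed usage with the objects from the question:
# for peep, seconds in time_calls(
#         lambda p: users.get_item(consistent=True, id=p), peeps):
#     print("%s %.4f" % (peep, seconds))
```

If the first entry is consistently much larger than the rest, the cost is in connection setup rather than in the lookups themselves.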
Update 2014-07-11 08:03 PST: The actual use case is looking up a user for each web request. As @gamaat said, the cost for DynamoDB is on the first lookup, because that's when the HTTPS connection is made. So if I can keep the DynamoDB connection between requests and reuse it, things should go faster. I tried storing the connection with werkzeug.contrib.cache.FileSystemCache, but the connection never seems to actually be stored for retrieval. Other values get stored fine, just not this connection object. Any ideas? And if this is not a good way to keep the connection between requests, what is?
Update 2014-07-11 15:30 PST: Since I'm using supervisor and uwsgi to manage my Flask app, the real question seems to be how to share the connection object between requests in my Flask app.
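One possible explanation, offered as an assumption rather than something confirmed for this setup: a file-based cache has to serialize values (typically with pickle) before writing them to disk, and an object that wraps a live network socket does not serialize. The sketch below only shows the general behaviour with a plain socket; `can_pickle` is a hypothetical helper and no boto object is involved.

```python
import pickle
import socket


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        # Sockets and objects holding them raise here.
        return False


sock = socket.socket()  # created but never connected; no network traffic
try:
    results = {
        "plain string": can_pickle("some value"),
        "open socket": can_pickle(sock),
    }
finally:
    sock.close()
```

If that is what happens here, an on-disk cache is the wrong place for the connection, and keeping it in process memory is the usual alternative.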
Upvotes: 4
Views: 751
Reputation: 319
The solution that appears to yield better response times (the average response time was ~500 ms before and is ~50 ms after) was to do two things:
1) put the Boto DynamoDB connection object in default_settings.py so that it gets loaded once into app.config["DYNDB_CONN"] per application load; and
2) configure uwsgi with a cheaper value of num_processes - 1 and a cheaper-initial value of num_processes - 1. This tells uwsgi to keep num_processes - 1 uwsgi processes running at all times, with the option of starting one more process if load requires it.
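As an illustration of step 2, a uwsgi ini fragment might look like the following. The worker count of 4 is a made-up value for the example; only the relationship between the three options comes from the description above.

```ini
[uwsgi]
; hypothetical maximum number of worker processes (num_processes)
processes = 4
; never drop below num_processes - 1 workers
cheaper = 3
; start with num_processes - 1 workers
cheaper-initial = 3
```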
I did this to minimize the number of uwsgi processes that would restart and therefore create a new Boto DynamoDB connection object (incurring HTTP connection setup costs).
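The reuse that both steps aim for can be sketched as a connection created once per process and handed back on every later request. This is a minimal sketch, not the code from my app: `get_connection` and the `factory` argument are hypothetical names, and in the real app the factory would be the call to boto.dynamodb2.connect_to_region shown in the question.

```python
import threading

_lock = threading.Lock()
_connection = None  # one connection object per worker process


def get_connection(factory):
    """Return the process-wide connection, creating it on first use.

    factory is any zero-argument callable that builds the connection.
    It is called at most once per process; later calls reuse the result.
    """
    global _connection
    if _connection is None:
        with _lock:
            # Re-check inside the lock in case another thread got here first.
            if _connection is None:
                _connection = factory()
    return _connection
```

Each uwsgi worker still pays the setup cost once, which is why keeping workers alive matters as much as caching the object.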
Upvotes: 4