Reputation: 40880
I moved an app from the bamboo stack to cedar and the server is behaving wierdly, it times out on the first request, taking 40sec according to the logs, while the next, same request, takes 1.5 sec. You would think the dyno slept, but I've got 2-3 dynos (using hirefire), and 3 workers.
It's an app with low user interaction, as it's mostly a huge background worker tool. The clients using it are few, but important, I obviously can't have the app timing out every morning or after lunch. You'd hate it too.
➜ my_app git:(master) ✗ heroku logs -t -p web --remote production
first request after some inactivity 40880ms
Started GET "/my_model" for xx.160.183.xxx at 2013-10-31 15:19:34 +0000
Processing by MyModelsController#index as HTML
User Load (6.8ms) SELECT "users".* FROM "users" WHERE "users"."id" = 110237164 LIMIT 1
MyModel Load (12.0ms) SELECT "my_model".* FROM "my_model" ORDER BY name ASC
Rendered my_model/_my_model_statuses.html.erb (3515.6ms)
Rendered my_model/_my_model_statuses.html.erb (3514.8ms)
Rendered my_model/_my_model_statuses.html.erb (3210.6ms)
Rendered my_model/_my_model_statuses.html.erb (3968.4ms)
Rendered my_model/_my_model_statuses.html.erb (2652.0ms)
Rendered my_model/_my_model_statuses.html.erb (2253.5ms)
Rendered my_model/_my_model_statuses.html.erb (1339.7ms)
Rendered my_model/_my_model_statuses.html.erb (1333.0ms)
Rendered my_model/_my_model_statuses.html.erb (2901.9ms)
Rendered my_model/_my_model_statuses.html.erb (636.3ms)
Rendered my_model/_my_model_statuses.html.erb (1598.6ms)
Rendered my_model/_my_model_statuses.html.erb (10.1ms)
Rendered my_model/_my_model_statuses.html.erb (909.3ms)
Rendered my_model/_my_model_statuses.html.erb (911.5ms)
Rendered my_model/_my_model_statuses.html.erb (1435.4ms)
Rendered my_model/_my_model_statuses.html.erb (1043.9ms)
Rendered my_model/_my_model_statuses.html.erb (1712.3ms)
Rendered my_model/_my_model_statuses.html.erb (793.8ms)
Rendered my_model/_my_model_statuses.html.erb (290.2ms)
Rendered my_model/_my_model_statuses.html.erb (1142.0ms)
Rendered my_model/_my_model_statuses.html.erb (342.6ms)
Rendered my_model/_my_model_statuses.html.erb (376.0ms)
Rendered my_model/_my_model_statuses.html.erb (2161.8ms)
Rendered my_model/_my_model_statuses.html.erb (1194.9ms)
Rendered my_model/_my_model_statuses.html.erb (1293.4ms)
Rendered my_model/_my_model_overview.html.erb (40695.0ms)
Rendered my_model/index.html.erb within layouts/application (40721.5ms)
Rendered shared/_menu.html.erb (1.0ms)
Rendered shared/_usermenu.html.erb (2.3ms)
Rendered shared/_header.html.erb (27.1ms)
Rendered shared/_messages.html.erb (0.1ms)
Rendered feedbacks/_new.html.erb (15.8ms)
Completed 200 OK in 40880ms (Views: 40856.2ms | ActiveRecord: 18.8ms)
It's a table, where each row is a partial, extracted so that I could see what took time.
Here it is after the timeout. 1398ms
Started GET "/my_models" for xxx.160.183.xxx at 2013-10-31 15:21:18 +0000
Processing by MyModelsController#index as HTML
User Load (2.4ms) SELECT "users".* FROM "users" WHERE "users"."id" = 110237164 LIMIT 1
MyModel Load (3.2ms) SELECT "my_models".* FROM "my_models" ORDER BY name ASC
Rendered my_models/_my_model_statuses.html.erb (87.7ms)
Rendered my_models/_my_model_statuses.html.erb (25.5ms)
Rendered my_models/_my_model_statuses.html.erb (35.7ms)
Rendered my_models/_my_model_statuses.html.erb (107.8ms)
Rendered my_models/_my_model_statuses.html.erb (163.9ms)
Rendered my_models/_my_model_statuses.html.erb (26.2ms)
Rendered my_models/_my_model_statuses.html.erb (26.3ms)
Rendered my_models/_my_model_statuses.html.erb (19.3ms)
Rendered my_models/_my_model_statuses.html.erb (121.4ms)
Rendered my_models/_my_model_statuses.html.erb (62.6ms)
Rendered my_models/_my_model_statuses.html.erb (94.9ms)
Rendered my_models/_my_model_statuses.html.erb (8.2ms)
Rendered my_models/_my_model_statuses.html.erb (19.1ms)
Rendered my_models/_my_model_statuses.html.erb (15.8ms)
Rendered my_models/_my_model_statuses.html.erb (59.9ms)
Rendered my_models/_my_model_statuses.html.erb (20.6ms)
Rendered my_models/_my_model_statuses.html.erb (78.8ms)
Rendered my_models/_my_model_statuses.html.erb (25.3ms)
Rendered my_models/_my_model_statuses.html.erb (13.3ms)
Rendered my_models/_my_model_statuses.html.erb (130.1ms)
Rendered my_models/_my_model_statuses.html.erb (9.7ms)
Rendered my_models/_my_model_statuses.html.erb (11.1ms)
Rendered my_models/_my_model_statuses.html.erb (92.0ms)
Rendered my_models/_my_model_statuses.html.erb (51.6ms)
Rendered my_models/_my_model_statuses.html.erb (23.1ms)
Rendered my_models/_my_models_overview.html.erb (1385.7ms)
Rendered my_models/index.html.erb within layouts/application (1386.3ms)
Rendered shared/_menu.html.erb (0.3ms)
Rendered shared/_usermenu.html.erb (0.6ms)
Rendered shared/_header.html.erb (1.5ms)
Rendered shared/_messages.html.erb (0.0ms)
Rendered feedbacks/_new.html.erb (1.5ms)
Completed 200 OK in 1398ms (Views: 1388.8ms | ActiveRecord: 5.5ms)
I'm not using any explicit cache, just plain Rails ActiveRecord to populate the table. Locally it's not like this at all. Heroku support recommends concurrency and caching, which is for a high-traffic app. This is a low-traffic app, a tool. They haven't commented on the bamboo vs cedar topic so I'm reaching out here in case anybody learns things I'm not aware of.
In both cases DB < 20 ms
, so this is all CPU stuff, rendering templates.
Upvotes: 0
Views: 639
Reputation: 40880
Yeah... I'll share instead of deleting the question. Hopefully my mistakes will help somebody get to the bottom of this faster.
The problem was with a mongodb query. Not sleeping dynos or services. Simply my own code.
It as Very confusing as MongoHQ has a "Slow queries" page that reported "no slow queries". I'd like to applaud Matt McCay for the excellent support and respectful approach, despite this being a (yet another?) "user error". He showed me the mongodb logging that showed this:
Tue Nov 5 22:59:46.722 [conn2335303] query myapp-production.day_reports query: { $query: { my_model_id: 2147259060, some_unindexed_field: { $gt: 0 } }, $orderby: { some_index_timestamp: -1 } } ntoreturn:1 ntoskip:0 nscanned:775 scanAndOrder:1 keyUpdates:0 locks(micros) r:7474265 nreturned:0 reslen:20 7474ms
Tue Nov 5 22:59:49.439 [conn2335303] query myapp-production.day_reports query: { $query: { my_model_id: 2147259071, some_unindexed_field: { $gt: 0 } }, $orderby: { some_index_timestamp: -1 } } ntoreturn:1 ntoskip:0 nscanned:502 scanAndOrder:1 keyUpdates:0 locks(micros) r:2570470 nreturned:0 reslen:20 2570ms
Show just two of many (30) in the same query. Note the 7474ms and 2570ms
I wonder why these wouldn't be listed as "slow queries" in MongoHQ... Let this be a warning to not let yourself be fooled or blinded by that.
Solution: In my case, I solved this by denormalizing, I saved the calculated value on my model on each write. I could have used indexes too, it would have been really fast, but in this case the models I read are in PostgreSQL and with the denormalization I wouldn't need to connect to MongoDB at all - on this particular request.
Why was it so much faster on the following queries?
The answer is found here: http://docs.mongodb.org/manual/faq/fundamentals/#does-mongodb-handle-caching
MongoDB saves/caches the last fetched data, in RAM.
How should I have troubleshooted this, ideally?
Upvotes: 0
Reputation: 7339
It's how heroku handles resources. It's called dyno-idling and they released this about it last year: https://devcenter.heroku.com/articles/dynos#dyno-idling.
The heorku way to fix it is to pay for an extra dyno, then it will run all the time. But you can also use ping services to ping your site. Some will do it every hour, some every fifteen minutes and some once a minute. That will keep your dyno active and your thin server always "awake"
Upvotes: 0