Reputation: 3945
We're running on four Amazon EC2 instances (one load balancer, one database server, and two app servers) and keep getting random timeouts: at least one a day, sometimes more. Here are some examples:
Errno::ETIMEDOUT: Connection timed out - connect(2)
/usr/local/rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/net/smtp.rb:546:in `initialize'
and
Timeout::Error: execution expired
[GEM_ROOT]/gems/activemodel-3.0.9/lib/active_model/attribute_methods.rb:354:in `match'
I'm not sure how to debug these as they are not related to application code or server load. CPU usage usually hovers below 10% with the biggest spike going up to 60%. The spikes are most likely due to running backups and do not correspond with the times of the timeout errors.
How can these types of errors be tracked down?
Upvotes: 2
Views: 1406
Reputation: 19145
The first timeout looks like a legit connection timeout sending mail via SMTP. Are you hosting your own SMTP server or using a service?
It looks like SendGrid has been experiencing delays and timeouts over the last couple of days:
We're currently seeing lots of volume in our queues and emails may be delayed for a short period. Stay tuned for updates. #status
Fix for SMTP Service Timeout/Fails
Set up a local mail relay that holds mail and re-sends it when failures like this occur. We use a local Postfix relay in production for exactly this problem: ActionMailer hands mail to Postfix via sendmail, and Postfix queues it and delivers it through an SMTP relay to SendGrid.
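For the Rails side of that setup, here is a minimal sketch of switching ActionMailer from direct SMTP to the local sendmail binary. The helper name use_local_relay and the constant LOCAL_RELAY_SETTINGS are made up for illustration, and the sendmail path is an assumption, so check where Postfix installed it on your instances. The delivery_method, sendmail_settings, and raise_delivery_errors options are standard ActionMailer configuration.

```ruby
# Sketch: hand outgoing mail to the local sendmail binary so that Postfix
# queues it and retries delivery to the upstream SMTP service.
# use_local_relay is a hypothetical helper name, not part of Rails.

LOCAL_RELAY_SETTINGS = {
  :location  => '/usr/sbin/sendmail', # assumed path to Postfix's sendmail wrapper
  :arguments => '-i -t'               # -i: ignore lone dots, -t: take recipients from headers
}.freeze

# mailer_config is any object that responds to the three setters below,
# for example config.action_mailer in config/environments/production.rb.
def use_local_relay(mailer_config, settings = LOCAL_RELAY_SETTINGS)
  mailer_config.delivery_method       = :sendmail
  mailer_config.sendmail_settings     = settings
  mailer_config.raise_delivery_errors = true # surface failures of the local hand-off
  mailer_config
end
```

In the production environment file this would be called as use_local_relay(config.action_mailer), or you can simply inline the three assignments. The Postfix side (pointing its relayhost at your mail service and supplying credentials) is configured separately and is not shown here.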
Upvotes: 3