Reputation: 31
I've been using Copperegg for a while now and have generally been happy with it until lately, where I have had a few issues. It's being used to monitor a number of EC2 instances that must be up 24/7.
Last week I was getting phantom alerts that servers had gone down when they hadn't, which I can cope with, but also I didn't get an alert when I should have done. One server had high CPU for over 5 mins when the alert should be triggered after 1 minute. The Copperegg support weren't not all that helpful, merely agreeing that an alert should have been triggered.
The latter of those problems is unacceptable and if it were to happen again outside of working hours then serious problems will follow.
So, I'm looking for alternative services that will do that same job. I've looked at Datadog and New Relic, but both have a significant problem in that they will only alert me of a problem 5 minutes after it's occurred, rather than the 1 minute I can get with Copperegg.
What else is out there that can do the same job and will also integrate with Pager Duty?
Upvotes: 3
Views: 4262
Reputation: 8890
In short Server Density is a monitoring tool that will monitor all the relevant server metrics. You can take a look at this page where it’s all described.
One server had high CPU for over 5 mins when the alert should be triggered after 1 minute
Server Density’s open source agent collects and posts the data to their server every minute and you can decide yourself when that alert should be triggered. In the alert below you can see that the alert will alert 1 person after 1 minute and then repeatedly alert every 5 minutes.
There is a lot of other metrics that you can alert on too.
What else is out there that can do the same job and will also integrate with Pager Duty?
Server Density also integrates with PagerDuty. The only thing you need to do is to generate an api key at PagerDuty and then provide that in the settings.
Just provide the API key in the settings and you can then in check pagerduty as one of the alert recipients.
You can find the pricing page here. I’ll give you a brief overview of it. The pricing starts at $10 for one server plus one web check and then get’s cheaper per server the more servers you add.
Everything will be monitored once every minute and there is no fees added for the amount of alerts added or triggered, even if that is an SMS to your phone number. The cost is slightly more expensive than the Cloudwatch example, but the support is good. If you used copperegg before they have a migration tool too.
Server Density allows you to monitor all the things! Then only thing you need to do is to send us custom metrics which you can do with a plugin written by yourself or by someone else.
I have to say that the graphs that Server Density provides is somewhat akin to eye candy too. Most other monitoring solutions I’ve seen out there have quite dull dashboards.
It will do the job for you. Not as cheap as CloudWatch, but doesn’t lock you in into AWS. It’ll give you 1 minute frequency metrics and integrate with pagerduty + a lot more stuff.
Upvotes: 3
Reputation: 25687
I believe that Amazon actually offers a service that would accomplish your goal - CloudWatch (pricing). I'm going to take your points one by one. Note that I haven't actually used it before, but the documentation is fairly clear.
One server had high CPU for over 5 mins when the alert should be triggered after 1 minute
It looks like CloudWatch can be configured to send an alert (which I'll get to) after one minute of a condition being met:
One can actually set conditions for many other metrics as well - this is what I see on one of my instances, and I think that detailed monitoring (I use free), might have even more:
What else is out there that can do the same job and will also integrate with Pager Duty?
I'm assuming you're talking about this. It turns out the Pager Duty has a helpful guide just for integrating CloudWatch. How nice!
Here's the pricing page, as you would probably like to parse it instead of me telling you. I'll give a brief overview, though:
You don't want basic monitoring, as it only gives you metrics once per five minutes (which you've indicated is unacceptable.) Instead, you want detailed monitoring (once every minute).
For an EC2 instance, the price for detailed monitoring is $3.50 per instance per month. Additionally, every alarm you make is $0.10 per month. This is actually very cheap if compared to CopperEgg's pricing - $70/mo versus maybe $30 per month for 9 instances and copious amounts of alarms. In reality, you'll probably be paying more like $10/mo.
Pager Duty's tutorial suggests you use SNS, which is another cost. The good thing: it's dirt cheap. $0.60 per million notifications. If you ever get above a dollar in a year for SNS, you need to perform some serious reliability improvements on your servers.
You're not just limited to Amazon's pre-packaged metrics! You can actually send custom metrics (time it took to complete a cronjob, whatever) to Cloudwatch via a PUT request. Quite handy.
Submit Custom Metrics generated by your own applications (or by AWS resources not mentioned above) and have them monitored by Amazon CloudWatch. You can submit these metrics to Amazon CloudWatch via a simple Put API request.
(from here)
So all in all: CloudWatch is quite cheap, can do 1-minute frequency stats, and will integrate with Pager Duty.
Upvotes: 7