Toad

Reputation: 15925

Best way to statistically detect anomalies in data

Our web app collects a huge amount of data about user actions, network activity, database load, etc.

All data is stored in warehouses and we have quite a lot of interesting views on this data.

If something odd happens, chances are it shows up somewhere in the data.

However, to manually detect if something out of the ordinary is going on, one has to continually look through this data, and look for oddities.

My question: what is the best way to detect changes in dynamic data that can be seen as 'out of the ordinary'?

Are Bayesian filters (I've seen these mentioned when reading about spam detection) the way to go?

Any pointers would be great!

EDIT: To clarify, the data shows, for example, a daily curve of database load. This curve typically looks similar to the curve from yesterday. Over time this curve might change slowly.

It would be nice if a warning went off when the day-to-day change in the curve falls outside some set parameters.

R

Upvotes: 6

Views: 3576

Answers (4)

Jouni K. Seppänen

Reputation: 44128

This question is impossible to answer without knowing much more about the particular data you have. For an overview of what kinds of approaches exist, see Anomaly Detection: A Survey by Chandola, Banerjee, and Kumar.

Upvotes: 4

Carlos Rendon

Reputation: 6222

Take a look at control charts. They provide a way to track changes in your data visually and to specify when the data is "out of control" or "anomalous". They are heavily used in manufacturing to ensure quality control.
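
As a rough illustration of the idea, here is a minimal sketch of a simplified control chart check in Python. It assumes you have a baseline of readings known to be normal and flags new readings that fall more than k standard deviations from the baseline mean (k = 3 is the conventional choice). The function names and the sample numbers are made up for the example, and a textbook individuals chart would usually estimate sigma from moving ranges rather than the plain sample standard deviation used here.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits from baseline readings.

    Assumption: the baseline holds readings from a period considered normal.
    Sigma is the plain sample standard deviation, a simplification.
    """
    center = mean(baseline)
    sigma = stdev(baseline)
    return center - k * sigma, center, center + k * sigma


def out_of_control(baseline, new_points, k=3.0):
    """Return (index, value) pairs for new readings outside the limits."""
    lower, _, upper = control_limits(baseline, k)
    return [(i, x) for i, x in enumerate(new_points) if x < lower or x > upper]


# Hypothetical database load readings (e.g. percent utilisation at a fixed hour).
baseline_load = [52.0, 48.5, 50.1, 49.7, 51.3, 50.8, 47.9, 50.4]
today_load = [50.2, 49.1, 63.0, 51.0]

lower, center, upper = control_limits(baseline_load)
flagged = out_of_control(baseline_load, today_load)
print(f"limits: {lower:.2f} .. {upper:.2f} (center {center:.2f})")
print("flagged:", flagged)
```

For a daily load curve that drifts slowly, one option is to recompute the baseline from a rolling window of recent days, so the limits follow gradual change while still catching sudden jumps.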

Upvotes: 5

Alix Axel

Reputation: 154553

Bayesian classification might help you find some anomalies in your data, depending on the type of data and how well you train your Bayesian filter.

There is even one available as a web service at uClassify.com.

Upvotes: 1

aehlke

Reputation: 15831

This depends so much on what the data is. Take a statistics class and learn the basics first. This isn't usually an easy or simple problem.

Upvotes: 1
