Reputation: 15039
I have a SQL Server
Database table with about 100K records of invoices (date, provider, type and amount).
This is sample data:
I want to group my data into 4 segments depending on the amount.
Where X1, X2 and X3 are amounts, but these amounts should be calculated statistically based on normal distribution so they make sense.
So I guess I have to get my AVG(amount) and my STDEV(amount) in order to know what my X1, X2 and X3 values should be acceptable.
Any clue on how to do this so the values makes sense for a report so I can generate a chart like this:
Upvotes: 1
Views: 1812
Reputation: 521914
SELECT CASE WHEN Amount < (SELECT AVG(Amount) FROM yourTable) -
(SELECT STDEVP(Amount) FROM yourTable)
THEN 'Group 1'
WHEN Amount < (SELECT AVG(Amount) FROM yourTable) THEN 'Group 2'
WHEN Amount < (SELECT AVG(Amount) FROM yourTable) +
(SELECT STDEVP(Amount) FROM yourTable)
THEN 'Group 3'
ELSE 'Group 4'
END
FROM yourTable
Explanation:
I assume that your boundaries correspond to the following:
X1 - One standard deviation below the mean Amount
X2 - The mean value of the Amount
X3 - One standard deviation about the mean Amount
Hence I assume that your X markers define the quartiles of the Amount distribution. Subqueries can be used for both the mean and standard deviation, and the AVG
and STDEVP
functions can be used for this.
Upvotes: 1