Reputation: 602
I'm looking for an elegant way (in terms of syntax, not necessarily efficient) to get the frequency distribution of a decimal range.
For example, I have a table with ratings column which can be a negative or positive. I want to get the frequency of rows with a rating of certain range. - ... - [-140.00 to -130.00): 5 - [-130.00 to -120.00): 2 - [-120.00 to -110.00): 1 - ... - [120.00 to 130.00): 17 - and so on.
[i to j) means i inclusive to j exclusive.
Thanks in advance.
Upvotes: 2
Views: 2348
Reputation: 44250
Just put your desired ranges into a table, and use that to discriminate the values.
-- SET search_path='tmp';
DROP TABLE measurements;
CREATE TABLE measurements
( zval INTEGER NOT NULL PRIMARY KEY
);
INSERT INTO measurements (zval)
SELECT generate_series(1,1000);
DELETE FROM measurements WHERE random() < 0.20 ;
DROP TABLE ranges;
CREATE TABLE ranges
( zmin INTEGER NOT NULL PRIMARY KEY
, zmax INTEGER NOT NULL
);
INSERT INTO ranges(zmin,zmax) VALUES
(0, 100), (100, 200), (200, 300), (300, 400), (400, 500),
(500, 600), (600, 700), (700, 800), (800, 900), (900, 1000)
;
SELECT ra.zmin,ra.zmax
, COUNT(*) AS zcount
FROM ranges ra
JOIN measurements me
ON me.zval >= ra.zmin AND me.zval < ra.zmax
GROUP BY ra.zmin,ra.zmax
ORDER BY ra.zmin
;
Results:
zmin | zmax | zcount
------+------+--------
0 | 100 | 89
100 | 200 | 76
200 | 300 | 76
300 | 400 | 74
400 | 500 | 86
500 | 600 | 78
600 | 700 | 75
700 | 800 | 75
800 | 900 | 80
900 | 1000 | 82
(10 rows)
Upvotes: 0
Reputation: 10469
You could get pretty close using 'select floor(rating / 10), count(*) from (table) group by 1'
Upvotes: 12
Reputation: 96544
I was thinking of seomthing that could do many levels like
DELIMITER $$
CREATE PROCEDURE populate_stats()
BEGIN
DECLARE range_loop INT Default 500 ;
simple_loop: LOOP
SET the_next = range_loop - 10;
Select sum(case when range between range_loop and the_next then 1 else 0 end) from table,
IF the_next=-500 THEN
LEAVE simple_loop;
END IF;
END LOOP simple_loop;
END $$
usage: call populate_stats();
Would handle 100 ranges from 500-490, 490-480, ... -480 - -490, -490 - -500
Upvotes: 1
Reputation: 35333
assuming a finite number of ranges.
Select
sum(case when val between -140 to -130 then 1 else 0 end) as sum-140_to_-130,
sum(Case when val between -130 to -120 then 1 else 0 end) as sum-130_to_-140,
...
FROM table
and if not, you could use dynamic SQL to generate the select allowing a number of ranges however you may run into a column limitation.
Upvotes: 0