Reputation: 4694
This is my query
Example 1
SELECT sum(percent::numeric)::numeric
FROM (
SELECT (amount::numeric * 100::numeric) / sum(amount::numeric) OVER (PARTITION BY a, b, c)::numeric as percent
FROM (
values ('a', 'b', 'c', 1, '609639'::numeric)
, ('a', 'b', 'c', 2, '72296.66'::numeric)
, ('a', 'b', 'c', 10, '174027.6'::numeric)
, ('a', 'b', 'c', 12, '154179.16'::numeric)
, ('a', 'b', 'c', 14, '97530.2'::numeric)
, ('a', 'b', 'c', 15, '1335529.7'::numeric)
, ('a', 'b', 'c', 25, '201315.1'::numeric)
, ('a', 'b', 'c', 51, '1424121.5'::numeric)
, ('a', 'b', 'c', 61, '51045.5'::numeric)
, ('a', 'b', 'c', 63, '14664.4'::numeric)
, ('a', 'b', 'c', 66, '740921.9'::numeric)
, ('a', 'b', 'c', 70, '189266.9'::numeric)
) AS sample_values (a, b, c, d, amount)
) as table_1;
sum
-------------------------
99.99999999999999997676
(1 row)
Example 2
SELECT sum(percent::numeric)::numeric
FROM (
SELECT (amount::numeric * 100::numeric) / sum(amount::numeric) OVER (PARTITION BY a, b, c)::numeric as percent
FROM (
values ('a', 'b', 'c', 2, '72296.66'::numeric)
, ('a', 'b', 'c', 10, '174027.6'::numeric)
, ('a', 'b', 'c', 12, '154179.16'::numeric)
, ('a', 'b', 'c', 15, '1335529.7'::numeric)
, ('a', 'b', 'c', 25, '201315.1'::numeric)
, ('a', 'b', 'c', 51, '1424121.5'::numeric)
, ('a', 'b', 'c', 61, '51045.5'::numeric)
, ('a', 'b', 'c', 63, '14664.4'::numeric)
, ('a', 'b', 'c', 66, '740921.9'::numeric)
, ('a', 'b', 'c', 70, '189266.9'::numeric)
) AS sample_values (a, b, c, d, amount)
) as table_1;
sum
-------------------------
100.0000000000000000583
(1 row)
I am casting everything to numeric
and yet the query still gives me
Details
psql --version
psql (PostgreSQL) 13.1
postgres=# select version();
version
------------------------------------------------------------
PostgreSQL 13.1, compiled by Visual C++ build 1914, 64-bit
(1 row)
Windows 10 x64
I am expecting a clean 100
since numeric shouldn't have floating point error. What went wrong?
Upvotes: 2
Views: 188
Reputation: 4694
I have read the source code it seems that division tried its best to determine the appropriate scale for numeric data type such that it will not perform worse than float.
This results in decimal places of ~16-19 digits used in calculation which result in precision error. https://github.com/postgres/postgres/blob/472e518a44eacd9caac7d618f1b6451672ca4481/src/interfaces/ecpg/pgtypeslib/numeric.c#L1044
SELECT (amount::numeric * 100::numeric)::numeric / sum(amount::numeric) OVER (PARTITION BY a, b, c) as percent
FROM (
values ('a', 'b', 'c', 15, '1335529.7'::numeric)
, ('a', 'b', 'c', 63, '14664.4'::numeric)
, ('a', 'b', 'c', 70, '189266.9'::numeric)
) AS sample_values (a, b, c, d, amount)
percent
------------------------
86.7530713671863074
0.95256716474142573277
12.2943614680722669
One way to override the scale is to manually set one of the numeric to use higher scale so that global scale is overridden;
SELECT (amount::numeric * 100::numeric)::numeric / sum(amount::numeric) OVER (PARTITION BY a, b, c) as percent
FROM (
values ('a', 'b', 'c', 15, '1335529.7'::numeric)
, ('a', 'b', 'c', 63, '14664.4'::numeric)
, ('a', 'b', 'c', 70, '189266.9'::numeric(100,50))
) AS sample_values (a, b, c, d, amount)
percent
-------------------------------------------------------
86.75307136718630741538759344991526254968459740129825
0.95256716474142573277270421270821410870428026432628
12.29436146807226685183970233737652334161112233437547
(3 rows)
With higher scale precision, we can avoid precision error.
If anybody know how to globally set numeric scale for the database please comment, I prefer not to manually set in the query as so. Thank you!
Upvotes: 2