Zanko
Zanko

Reputation: 4694

Postgresql Numeric precision error on window function

This is my query

Example 1

SELECT sum(percent::numeric)::numeric
FROM (
         SELECT (amount::numeric * 100::numeric) / sum(amount::numeric) OVER (PARTITION BY a, b, c)::numeric as percent
         FROM (
                  values ('a', 'b', 'c', 1, '609639'::numeric)
                       , ('a', 'b', 'c', 2, '72296.66'::numeric)
                       , ('a', 'b', 'c', 10, '174027.6'::numeric)
                       , ('a', 'b', 'c', 12, '154179.16'::numeric)
                       , ('a', 'b', 'c', 14, '97530.2'::numeric)
                       , ('a', 'b', 'c', 15, '1335529.7'::numeric)
                       , ('a', 'b', 'c', 25, '201315.1'::numeric)
                       , ('a', 'b', 'c', 51, '1424121.5'::numeric)
                       , ('a', 'b', 'c', 61, '51045.5'::numeric)
                       , ('a', 'b', 'c', 63, '14664.4'::numeric)
                       , ('a', 'b', 'c', 66, '740921.9'::numeric)
                       , ('a', 'b', 'c', 70, '189266.9'::numeric)
              ) AS sample_values (a, b, c, d, amount)
     ) as table_1;

sum
-------------------------
 99.99999999999999997676
(1 row)

Example 2

SELECT sum(percent::numeric)::numeric
FROM (
         SELECT (amount::numeric * 100::numeric) / sum(amount::numeric) OVER (PARTITION BY a, b, c)::numeric as percent
         FROM (
                  values  ('a', 'b', 'c', 2, '72296.66'::numeric)
                       , ('a', 'b', 'c', 10, '174027.6'::numeric)
                       , ('a', 'b', 'c', 12, '154179.16'::numeric)
                       , ('a', 'b', 'c', 15, '1335529.7'::numeric)
                       , ('a', 'b', 'c', 25, '201315.1'::numeric)
                       , ('a', 'b', 'c', 51, '1424121.5'::numeric)
                       , ('a', 'b', 'c', 61, '51045.5'::numeric)
                       , ('a', 'b', 'c', 63, '14664.4'::numeric)
                       , ('a', 'b', 'c', 66, '740921.9'::numeric)
                       , ('a', 'b', 'c', 70, '189266.9'::numeric)
              ) AS sample_values (a, b, c, d, amount)
     ) as table_1;

sum
-------------------------
 100.0000000000000000583
(1 row)

I am casting everything to numeric and yet the query still gives me

Details

psql --version
psql (PostgreSQL) 13.1

postgres=# select version();

                          version
------------------------------------------------------------
 PostgreSQL 13.1, compiled by Visual C++ build 1914, 64-bit
(1 row)

Windows 10 x64

I am expecting a clean 100 since numeric shouldn't have floating point error. What went wrong?

Upvotes: 2

Views: 188

Answers (1)

Zanko
Zanko

Reputation: 4694

I have read the source code it seems that division tried its best to determine the appropriate scale for numeric data type such that it will not perform worse than float.

This results in decimal places of ~16-19 digits used in calculation which result in precision error. https://github.com/postgres/postgres/blob/472e518a44eacd9caac7d618f1b6451672ca4481/src/interfaces/ecpg/pgtypeslib/numeric.c#L1044

SELECT (amount::numeric * 100::numeric)::numeric / sum(amount::numeric) OVER (PARTITION BY a, b, c) as percent
FROM (
         values ('a', 'b', 'c', 15, '1335529.7'::numeric)
              , ('a', 'b', 'c', 63, '14664.4'::numeric)
              , ('a', 'b', 'c', 70, '189266.9'::numeric)
     ) AS sample_values (a, b, c, d, amount)

        percent
------------------------
    86.7530713671863074
 0.95256716474142573277
    12.2943614680722669

One way to override the scale is to manually set one of the numeric to use higher scale so that global scale is overridden;

SELECT (amount::numeric * 100::numeric)::numeric / sum(amount::numeric) OVER (PARTITION BY a, b, c) as percent
FROM (
         values ('a', 'b', 'c', 15, '1335529.7'::numeric)
              , ('a', 'b', 'c', 63, '14664.4'::numeric)
              , ('a', 'b', 'c', 70, '189266.9'::numeric(100,50))
     ) AS sample_values (a, b, c, d, amount)

                        percent
-------------------------------------------------------
 86.75307136718630741538759344991526254968459740129825
  0.95256716474142573277270421270821410870428026432628
 12.29436146807226685183970233737652334161112233437547
(3 rows)

With higher scale precision, we can avoid precision error.

If anybody know how to globally set numeric scale for the database please comment, I prefer not to manually set in the query as so. Thank you!

Upvotes: 2

Related Questions