Saksham Agarwal
Saksham Agarwal

Reputation: 185

Loss of precision using Sum function over Athena

I have an Athena table where one of the fields is:-

"amount" with type as "double".

When I run the Query:-

SELECT employeeid, SUM(amount)
FROM "audit"."payrollinputsaudit" where paycode = 'APT' AND paydate = '2022-02-25' and partition_0 = '2022' GROUP BY employeeid ;

I get the output as:-

#   employeeid  _col1
1   110825748   303.0
2   111771717   375.83000000000004
3   111815779   268.0
4   108698147   470.0
5   100816149   372.71000000000004

Here for row 2 and row 5 the Sum value has lost precision.

When I fetch individual amount for employeeId :- 111771717

using query:-

SELECT employeeid, amount
FROM "audit"."payrollinputsaudit" where paycode = 'APT' AND paydate = '2022-02-25' AND employeeid = 111771717 ;

The output is:-

#   employeeid  amount
1   111771717   50.88
2   111771717   86.6
3   111771717   18.47
4   111771717   108.72
5   111771717   111.16

Also when I run group by for the same particular employee id using the query:-

SELECT employeeid, SUM(amount)
FROM "audit"."payrollinputsaudit" where paycode = 'APT' AND paydate = '2022-02-25' and partition_0 = '2022' and employeeid = 111771717 GROUP BY employeeid ;

The output is:-

employeeid _col1

1 111771717 375.83000000000004

For other employee id when I do the same:-

Query:-

SELECT employeeid, SUM(amount)
FROM "audit"."payrollinputsaudit" where paycode = 'APT' AND paydate = '2022-02-25' and partition_0 = '2022' and employeeid = 100816149 GROUP BY employeeid ;

Output is:-

#   employeeid  _col1
1   100816149   372.71

So in some cases I see there is loss of precision in adding, The amount stored in my table in upto 2 decimal places only so why is this happening?

Upvotes: 1

Views: 1300

Answers (1)

Guru Stron
Guru Stron

Reputation: 143088

This should happen due to floating-point arithmetic. Cast your amount to decimal with needed precision:

-- sample data
WITH dataset(employeeid, amount) AS (VALUES
    (111771717,   50.88),
    (111771717,   86.6),
    (111771717,   18.47),
    (111771717,   108.72),
    (111771717,   111.16)
  )

-- query    
SELECT sum(cast(amount as DECIMAL(10,2)))
FROM dataset
group by employeeid

Output:

_col0
375.83

Upvotes: 3

Related Questions