yung peso
yung peso

Reputation: 1766

How can I correctly apply SELECT DISTINCT ON to my SUM() query

I have this query:

SELECT SUM(change_percent) 
FROM api.security_price 
WHERE date = (SELECT MAX(date) FROM api.security_price)
AND symbol in ('AAPL','TSLA','MSFT');

It runs a little slow, with indexes included. Instead, I want to test out SELECT DISTINCT ON (symbol).

Here is why I want to use DISTINCT ON:

However, I'm failing to do so:

Here is how I try to implemenmt:

SELECT SUM(change_percent) 
FROM 

     (SELECT DISTINCT ON (symbol) symbol, change_percent FROM api.security_price) AS a 

WHERE a.symbol in ('AAPL','TSLA','MSFT');

This query is working (and its much faster), but I'm not getting the correct calculation. How can I set everything up correctly in my new query?

Upvotes: 1

Views: 31

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269463

Your original query should be fine:

SELECT SUM(change_percent) 
FROM api.security_price 
WHERE date = (SELECT MAX(date) FROM api.security_price) AND
      symbol IN ('AAPL', 'TSLA', 'MSFT');

I would recommend an index on (date, symbol).

If you want the max date per symbol, then the query would be:

SELECT SUM(change_percent) 
FROM (SELECT DISTINCT ON (symbol) sp.*
      FROM api.security_price sp
      WHERE symbol IN ('AAPL', 'TSLA', 'MSFT')
      ORDER BY symbol, date DESC
     ) sp

However, this is not guaranteed to be equivalent to your original query.

Upvotes: 1

Related Questions