Select n first elements such that accumulated sum reaches a given value

Question

Starting from the PostgreSQL query

SELECT filename FROM files_storage ORDER BY date;

I would like to reduce the output table to the first n rows such that the accumulated sum from row 1 to n of a column called size reaches at least max_value.

Example:

date         filename   size
2016-09-01   /a/aaa/    20
2016-09-02   /a/bbb/    70
2016-09-03   /a/ccc/    20
2016-09-04   /a/ddd/    30
2016-09-05   /a/eee/    50

If max_value is 100, I want to return the first three rows because 20 + 70 + 20 >= 100.

I have seen answers here to similar questions, but nothing in PostgreSQL.

Gordon Linoff · Accepted Answer

Use the cumulative sum functionality:

SELECT fs.*
FROM (SELECT fs.*, SUM(size) OVER (ORDER BY date) AS running_sum
      FROM files_storage fs
     ) fs
WHERE running_sum >= 100 AND running_sum - size < 100;

Oh, that returns only the one row where the running sum first crosses the boundary.

You want every row up to and including that one, so drop the first condition instead:

SELECT fs.*
FROM (SELECT fs.*, SUM(size) OVER (ORDER BY date) AS running_sum
      FROM files_storage fs
     ) fs
WHERE running_sum - size < 100;
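
To see why the `running_sum - size < 100` filter works, here is a sketch that runs the same query against the sample rows from the question. The inline VALUES list stands in for files_storage, and the column types are assumptions, since the question does not give the table definition.

```sql
-- Sample rows from the question, inlined so the query runs without a real table.
-- Column types (date, text, integer) are assumed; the question does not state them.
WITH files_storage (date, filename, size) AS (
    VALUES (DATE '2016-09-01', '/a/aaa/', 20),
           (DATE '2016-09-02', '/a/bbb/', 70),
           (DATE '2016-09-03', '/a/ccc/', 20),
           (DATE '2016-09-04', '/a/ddd/', 30),
           (DATE '2016-09-05', '/a/eee/', 50)
)
SELECT fs.date, fs.filename, fs.size, fs.running_sum
FROM (SELECT fs.*, SUM(size) OVER (ORDER BY date) AS running_sum
      FROM files_storage fs
     ) fs
-- running_sum - size is the total of all earlier rows:
-- keep a row only while that earlier total is still below 100.
WHERE fs.running_sum - fs.size < 100
ORDER BY fs.date;
```

The running sums are 20, 90, 110, 140, 190, so the totals before each row are 0, 20, 90, 110, 140. Only the first three are below 100, which keeps /a/aaa/, /a/bbb/ and /a/ccc/, as the question asks.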

If dates can repeat, the default window frame (RANGE) gives every row that shares a date the same running sum. To accumulate row by row instead, with ties on the same date broken arbitrarily, use a ROWS frame:

SELECT fs.*
FROM (SELECT fs.*,
             SUM(size) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum
      FROM files_storage fs
     ) fs
WHERE running_sum - size < 100;
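
The difference between the two frames is easier to see side by side. The sketch below uses made-up rows with a duplicated date (they are not from the question) and computes both sums. It also adds filename as a tiebreaker in the ORDER BY; that is an assumption of this sketch, not part of the query above, and it only makes the order deterministic if filename is unique within a date.

```sql
-- Hypothetical rows: two files share 2016-09-02.
WITH files_storage (date, filename, size) AS (
    VALUES (DATE '2016-09-01', '/a/aaa/', 20),
           (DATE '2016-09-02', '/a/bbb/', 30),
           (DATE '2016-09-02', '/a/ccc/', 90)
)
SELECT fs.date, fs.filename, fs.size,
       -- Default frame (RANGE): rows with the same date are peers and share one sum.
       SUM(size) OVER (ORDER BY date) AS range_sum,
       -- ROWS frame: the sum grows one row at a time; filename breaks ties.
       SUM(size) OVER (ORDER BY date, filename
                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum
FROM files_storage fs
ORDER BY fs.date, fs.filename;
```

Here range_sum is 20, 140, 140 while rows_sum is 20, 50, 140. With the default frame, `range_sum - size` for /a/bbb/ is 110, so the `< 100` filter would drop that row while keeping /a/ccc/. With the ROWS frame, `rows_sum - size` is 0, 20, 50, so all three rows are kept and the threshold is crossed at the last one.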
