Grouping by column and calculating percentage, windows function partition by

Question

I have the following example table "invoices".

+----------------+-------------+--------+
| invoice_number | date        | status |
+----------------+-------------+--------+
| 1              | 2 apr 2021  | 1      |
+----------------+-------------+--------+
| 2              | 9 apr 2021  | 0      |
+----------------+-------------+--------+
| 3              | 9 apr 2021  | 1      |
+----------------+-------------+--------+
| 4              | 9 apr 2021  | 1      |
+----------------+-------------+--------+
| 5              | 16 apr 2021 | 1      |
+----------------+-------------+--------+
| 6              | 16 apr 2021 | 0      |
+----------------+-------------+--------+
| 7              | 16 apr 2021 | 0      |
+----------------+-------------+--------+
| 8              | 16 apr 2021 | 0      |
+----------------+-------------+--------+
| 9              | 16 apr 2021 | 1      |
+----------------+-------------+--------+

(in status, 1 is for paid, 0 for unpaid) and from it I'm trying to get the following:

the number of invoices per week (each date is a week so a group per date works)
the percentage of these invoices that were paid (grouped by date)

I was trying to use a window function to organize better (since I have several more fields, this is just simplified for the example)

and I was doing

select date,
       count(invoice_number) over (partition by date) as NumberOfInvoices,
       (sum(status)/count(status) over (partition by date))*100 as percentagePaid
from invoices

but of course this is not working, at the same time I'm getting all the rows for the table as result, instead of them grouped by date.

should I stop trying to use the over partition by here? or am I just applying it incorrectly for what I need?

Gordon Linoff · Accepted Answer

the percentage of these invoices that were paid (grouped by date)

This would simply be aggregation:

select date, avg(status * 1.0) as paid_ratio
from invoices i
group by date;

If you wanted this per row, then you would use window functions:

select i.*,
       avg(i.status * 1.0) over (partition by i.date) as paid_ratio
from invoices i;

Note the * 1.0. SQL Server does integer division -- and averages -- on integers. status looks like an integer, so the * 1.0 converts it to a number with decimal places.

Grouping by column and calculating percentage, windows function partition by

Answers (1)

Related Questions