heather
heather

Reputation: 21

How to calculate top n percent of sum with group by and put others in "Other" category similar to crystal reporting does with group sorting/ filtering

Similar to crystal reports where you can implement the custom group sorting to sort and filter by top n percent for the groups of a sum field and place the remaining records in an Other category, how can I do this same in SQL Example of crystal reports feature

CustomerName Cost Fees Month Year
abc customer 200.10 20449.00 1 2021
abc customer 240.10 24440.10 1 2021
abc customer 345.55 3455559.10 2 2021
abc customer 34444.44 45555.55 2 2021
def customer 3401.22 3333.11 1 2021
def customer 23444.43 444442.12 1 2021
ghi customer 3202.11 9873.01 3 2021
ghi customer 5421.01 89033.33 1 2021
jkl customer 22.02 0.02 2 2021
jkl customer 333.03 0.33 3 2021
mno customer 2.05 0.01 3 2021
mno customer 333.05 0.03 2 2021
pqr customer 44.06 0.01 2 2021
pqr customer 44.06 0.01 1 2021

In Microsoft SQL, I need to write a query similar to Crystal Reports that groups by customer, month, and year, and filters the top 90 percent of the sum of the Fees columns and then gathers anything that didnt get in the top 90% in Others summarizing the Cost and Fees in one row.

So assuming jkl, mno, and pqr customers didnt make the top 90%, those records would come back like this

CustomerName Cost Fees Month Year
Others 44.06 0.01 1 2021
Others 399.13 0.06 2 2021
Others 335.08 0.34 3 2021

And the rest of the records that made the top 90% would be their normal customer name, sum of cost, sum of fees, month, year grouped by customername, month, and year

Upvotes: 0

Views: 183

Answers (1)

zelarian
zelarian

Reputation: 256

I'm unsure as to whether by "top 90%" you mean 90th percentile or are less than 90% of the max value. The query for either is basically the same, but 90% of max value is what I think you're going for so it should be as follows:

select sub_query.*
from 
    (
    select CustomerName, sum(Cost) [Cost], sum(Fees) [Fees], Month, Year
    from table
    group by CustomerName, Month, Year 
    ) sub_query
    left join (
        select max(fees) * 0.9 [fees]
        from (
            select CustomerName, sum(Cost) [Cost], sum(Fees) [Fees], Month, Year
            from table
            group by CustomerName, Month, Year 
        ) sub_query2 
    ) filter_query on 1 = 1 /*The subquery only returns 1 row so we need to force a one-to-many join*/
where 
    sub_query.fees < filter_query.fees

After comment: The comment cleared some things up. You want to display the details (aggregates by CustomerName, Month, Year) for all records that are 90% of max fees of that aggregate and append a summary (aggregates by just month, year) of all records that don't meet that filter. This should do what you want.

select sub_query.*
from 
    (
    select CustomerName, sum(Cost) [Cost], sum(Fees) [Fees], Month, Year
    from table
    group by CustomerName, Month, Year 
    ) sub_query
    left join (
        select max(fees) * 0.9 [fees]
        from (
            select CustomerName, sum(Cost) [Cost], sum(Fees) [Fees], Month, Year
            from table
            group by CustomerName, Month, Year 
        ) sub_query2 
    ) filter_query on 1 = 1 /*The subquery only returns 1 row so we need to force a one-to-many join*/
where 
    sub_query.fees >= filter_query.fees
union all
select 'Others', sum(Cost) [Cost], sum(Fees) [Fees], Month, Year
from 
    (
   select CustomerName, sum(Cost) [Cost], sum(Fees) [Fees], Month, Year
    from table
    group by CustomerName, Month, Year 
    ) sub_query
    left join (
        select max(fees) * 0.9 [fees]
        from (
            select CustomerName, sum(Cost) [Cost], sum(Fees) [Fees], Month, Year
            from table
            group by CustomerName, Month, Year 
        ) sub_query2 
    ) filter_query on 1 = 1 /*The subquery only returns 1 row so we need to force a one-to-many join*/
where 
    sub_query.fees < filter_query.fees
group by [Fees], Month, Year

Upvotes: 1

Related Questions