Reputation: 165
I want to translate a question to sql type.The question is this: "Most successful SKU in terms of revenue (in USD) per country" So i need the best selling product for each country
My data look like this:
,Platform,Userid,In-app Product,Country,Currency,Timestamp,Transaction Receipt,Price
0,IOS,5.669802981,offer4,US,USD,28/03/2017 02:30,240200367307.75848,1.99
1,ANDROID,6.831350998,offer4,US,USD,27/03/2017 23:23,436138823984.4136,3.99
2,ANDROID,6.831233793,pack2.cash,CA,USD,27/03/2017 18:32,192318559985.1093,4.112
This is what i have tried
select Sum("Price"),"In-app Product","Country" from cleandataset group by "Country","In-app Product" order by "sum" desc,"Country"
The problem is i dont know how to get only the first row from each group.Any help would be appreciated!
My query right now produce something like this:
sum In-app Product Country
2578.591999999975 "offer4" "US"
1917.3325600000014 "offer10" "US"
1885.1750599999978 "pack4.shields" "US"
1387.155899999998 "pack3.shields" "US"
1200.6980999999992 "offer10" "FR"
1150.638300000004 "offer4" "FR"
811.1277999999998 "pack4.shields" "FR"
790.5170000000022 "offer4" "IT"
773.3864000000004 "offer10" "ES"
763.307199999999 "pack2.shields" "US"
735.6938099999996 "pack3.cash" "US"
702.2611000000019 "offer4" "ES"
701.3979000000004 "offer10" "IT"
675.4060400000003 "offer15" "US"
672.7070000000007 "pack3.shields" "FR"
660.80911 "pack5.shields" "US"
So i think i need to select the Product with the biggest sum in each country
Upvotes: 0
Views: 302
Reputation: 1401
with total as (
select
Country,app_product, sum(price) over(partition by country,app_product) as total
from [table]
)
,
ranking as (
select
*, row_number() Over(Partition by country by total desc) as rank_
from total_
)
select
country, app_product, total from ranking where rank_ = 1
Upvotes: 2
Reputation: 1269443
You can use distinct on
directly with group by
:
select distinct on ("Country") Sum("Price"), "In-app Product", "Country"
from cleandatase
group by "Country", "In-app Product"
order by "Country", Sum("Price") desc;
Note: As Thorsten points out, if there are ties and you want all the ties, then distinct on
is not the simplest solution.
Upvotes: 1
Reputation: 13009
You were almost there. Use your query and distinct on
:
select distinct on (country) *
from
(
select country, app_product, sum(price) as "sum"
from the_table
group by country, app_product
) as t
order by country, "sum" desc;
Upvotes: 1
Reputation: 94859
While PostgreSQL has DISTINCT ON
to get only one row per group, this often doesn't suffice, because it cannot deal with ties. If two products in a country have the same top total, then you'll probably want to show both.
You can use this standard SQL query instead:
select
country,
in_app_product,
sum(price) as total
from cleandataset
group by country, in_app_product
order by rank() over (partition by country order by total desc)
fetch first row with ties;
Or this:
select country, in_app_product, total
from
(
select
country,
in_app_product,
sum(price) as total,
max(sum(total)) over (partition by country) as max_total
from cleandataset
group by country, in_app_product
)
where total = max_total;
And there are still other ways to do this.
Upvotes: 1
Reputation: 53
You can resolve it using a ROW_NUMBER
function with PARTITION BY
Product, Country and select the ROW_NUMBER = 1
See https://www.postgresqltutorial.com/postgresql-row_number/
Upvotes: 1