Reputation: 97
This table is an aggregated table and count means number of rows we have in the data for the first four columns. I want to assign one and just one organization id to each customer (organization with maximum number of appearance in the data "organization which has highest sum of count").
customer name organization id item city count
Jan Tomas 3478 cloth Rom 20
Jan Tomas 3478 cloth Milan 12
Jan Tomas 3478 shoe Munich 14
Jan Tomas 3478 shoe Rom 5
Jan Tomas 653 cloth Berlin 10
Jan Tomas 653 shoe Brussels 5
Jan Tomas 123 cloth Paris 12
Jan Tomas 123 cloth Rom 14
Martin Muller 654 cloth Rom 15
Martin Muller 654 cloth Berlin 16
Martin Muller 654 shoe Rom 7
Martin Muller 980 cloth Milan 28
Martin Muller 980 shoe Paris 19
Janatan Kery 765 cloth Rom 20
Janatan Kery 765 cloth Munich 11
Janatan Kery 765 shoe Rom 22
Janatan Kery 476 cloth Milan 2
Janatan Kery 476 cloth Rom 24
I would like to have the output as follow. Any help from you would be really appreciated. This is just sample of data . I have more than 2 millions unique customers.
customer name organization id item city count
Jan Tomas 3478 cloth Rom 20
Jan Tomas 3478 cloth Milan 12
Jan Tomas 3478 shoe Munich 14
Jan Tomas 3478 shoe Rom 5
Martin Muller 980 cloth Milan 28
Martin Muller 980 shoe Paris 19
Janatan Kery 765 cloth Rom 20
Janatan Kery 765 cloth Munich 11
Janatan Kery 765 shoe Rom 22
Upvotes: 0
Views: 38
Reputation: 10297
Try this, it uses ROW_NUMBER()
to find the largest SUM(Count)
per CustomerName/OrganizationID
, then joins back to the main table for the rest of the columns:
SELECT yt.*
FROM YourTable yt
JOIN (SELECT CustomerName,
OrganizationID,
ROW_NUMBER() OVER (PARTITION BY CUST ORDER BY SUM(Count) DESC) RN
FROM YourTable
GROUP BY CustomerName, OrganizationID) A ON A.RN = 1
AND A.CustomerName = yt.CustomerName
AND A.OrganizationID = yt.OrganizationID
Upvotes: 1
Reputation: 2729
This would be how I'd do it in MS SQL Server. Maybe it will lead you in the right direction.
WITH cte
AS (SELECT [customer name]
, [organization id]
, item
, city
, COUNT
, ROW_NUMBER() OVER (PARTITION BY item, city ORDER BY COUNT DESC) AS RN
FROM TableName
)
SELECT [customer name]
, [organization id]
, item
, city
, COUNT
FROM cte
WHERE RN = 1
Upvotes: 0