Ambrose
Ambrose

Reputation: 163

Select top 5 score results per group in psql

I need some help with how to select the top 5 "most critical servers per cutomer".

Currently I have a select statement which returns the Hostname, SystemID, Customer and Critical values for all the machines. What I need to do next with this query is to only select the top 5 most critical(highest score in Critical) for each Customer.

My current select statement looks like this:

SELECT COALESCE(rhss_py_results.Hostname, sid_py_results.Name) AS Hostname, COALESCE(rhss_py_results.SystemID, security_scores.SystemID) AS SystemID, Customer, Critical
FROM rhss_py_results
INNER JOIN sid_py_results
ON rhss_py_results.hostname = sid_py_results.Name
INNER JOIN Customers
ON sid_py_results.SecurityDomain = Customers.SecurityDomain
INNER JOIN security_scores
ON rhss_py_results.SystemID=security_scores.SystemID
ORDER BY Customer;

and it returns something along the lines of this: (data changes due to privacy)

     hostname      |  systemid  |         customer         | critical
-------------------+------------+--------------------------+----------
 aaa-aaaa_aaaa     | 1000000024 | Anna                     |       48
 aaa-aaa3-aaa1     | 1000000038 | Anna                     |        5
 aaaaaa001         | 1000000013 | Kalle                    |       10
 aaaaaa002         | 1000000043 | Kalle                    |        1
 aaaaaa005         | 1000000087 | Pelle                    |        5
 bbbbbb0010        | 1000000003 | Pelle                    |        0
 cccccc0001        | 1000000029 | Sara                     |        0
 ddd-dddd-c001     | 1000000063 | Anna                     |       26
 ddd-dddd-c002     | 1000000064 | Anna                     |       24
 ddd-dddd-c003     | 1000000012 | Anna                     |        5
 fff-ffff-f001     | 1000000095 | Anna                     |       13
 gggggg0001        | 1000000077 | Sara                     |        0
 gggggg0002        | 1000000040 | Pelle                    |        0
 gggggg0003        | 1000000039 | Pelle                    |        1
 mmmmmm033         | 1000000047 | Kalle                    |       31
 mmmmmm034         | 1000000045 | Kalle                    |       37
 mmmmmm036         | 1000000046 | Pelle                    |        3
 mmmmmm037         | 1000000082 | Pelle                    |        3
 mmmmmm045         | 1000000091 | Håkan                    |        0

Some Customers only have 1 server, others have 15 and if there is only 1 server for a customer it is enough to only list that one. If a customer has several servers with the same critical value tied for the top 5, returning the most relevant 5 depending on hostname is fine.

I have over 32 different customers and this number may vary in the future.

The following result should be the final product(ish):

     hostname      |  systemid  |         customer         | critical
-------------------+------------+--------------------------+----------
 aaa-aaaa_aaaa     | 1000000024 | Anna                     |       48
 ddd-dddd-c001     | 1000000063 | Anna                     |       26
 ddd-dddd-c002     | 1000000064 | Anna                     |       24
 fff-ffff-f001     | 1000000095 | Anna                     |       13
 aaa-aaa3-aaa1     | 1000000038 | Anna                     |        5
 mmmmmm045         | 1000000091 | Håkan                    |        0
 mmmmmm034         | 1000000045 | Kalle                    |       37
 mmmmmm033         | 1000000047 | Kalle                    |       31
 aaaaaa001         | 1000000013 | Kalle                    |       10
 aaaaaa002         | 1000000043 | Kalle                    |        1
 aaaaaa005         | 1000000087 | Pelle                    |        5
 mmmmmm036         | 1000000046 | Pelle                    |        3
 mmmmmm037         | 1000000082 | Pelle                    |        3
 gggggg0003        | 1000000039 | Pelle                    |        1
 bbbbbb0010        | 1000000003 | Pelle                    |        0
 cccccc0001        | 1000000029 | Sara                     |        0
 gggggg0001        | 1000000077 | Sara                     |        0

I have read the following article, but don't understand how to apply it to my own query as I am quite new with databases. http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/

Is there anyone who would be able to help me with this issue?

//Ambrose

Upvotes: 2

Views: 174

Answers (2)

Madhivanan
Madhivanan

Reputation: 13700

Use row_number function

SELECT *
FROM
  (SELECT (row_number() over (partition BY Customer
                             ORDER BY COALESCE(rhss_py_results.SystemID, security_scores.SystemID) DESC)) AS sno,
                       COALESCE(rhss_py_results.Hostname, sid_py_results.Name) AS Hostname,
                       COALESCE(rhss_py_results.SystemID, security_scores.SystemID) AS SystemID,
                       Customer,
                       Critical
   FROM rhss_py_results
   INNER JOIN sid_py_results ON rhss_py_results.hostname = sid_py_results.Name
   INNER JOIN Customers ON sid_py_results.SecurityDomain = Customers.SecurityDomain
   INNER JOIN security_scores ON rhss_py_results.SystemID=security_scores.SystemID
   ORDER BY Customer) AS t
WHERE sno<=5;

Upvotes: 3

Shahzad Riaz
Shahzad Riaz

Reputation: 354

Use order by Critical desc..

SELECT COALESCE(rhss_py_results.Hostname, sid_py_results.Name) AS Hostname, COALESCE(rhss_py_results.SystemID, security_scores.SystemID) AS SystemID, Customer, Critical
FROM rhss_py_results
INNER JOIN sid_py_results
ON rhss_py_results.hostname = sid_py_results.Name
INNER JOIN Customers
ON sid_py_results.SecurityDomain = Customers.SecurityDomain
INNER JOIN security_scores
ON rhss_py_results.SystemID=security_scores.SystemID
ORDER BY Customer,Critical desc;

Upvotes: 0

Related Questions