David Flynn
David Flynn

Reputation: 105

Selecting the most frequent value in a column based on the value of another column in the same row?

So basically what I'm trying to do is generate a report for our stores. We have an incident report website where the employees can report an incident that takes place at any of our stores. So in the general report I'm trying to generate, I want to show the details for each store we have (Five stores). This would include the name of the store, number of incidents, oldest incident date, newest incident date, and then the most recurring type of incident at each store.

SELECT  Store.Name AS [Store Name], COUNT(*) AS [No. Of Incidents], Min(CAST(DateNotified AS date)) AS [Oldest Incident], Max(CAST(DateNotified AS date)) AS [Latest Incident], 
        (   SELECT TOP 1    IncidentType.Details
            FROM            IncidentDetails
            INNER JOIN Store ON IncidentDetails.StoreID = Store.StoreID
            INNER JOIN IncidentType On IncidentDetails.IncidentTypeID = IncidentType.IncidentTypeID
            Group By IncidentType.Details, IncidentDetails.StoreID
            Order By COUNT(IncidentType.Details) DESC) AS [Most Freqeuent Incident]
FROM IncidentDetails
INNER JOIN Store ON IncidentDetails.StoreID = Store.StoreID
INNER JOIN IncidentType On IncidentDetails.IncidentTypeID = IncidentType.IncidentTypeID
GROUP BY Store.Name

Just to make it clear, the IncidentDetails table stores all the details about the incident including which store it occured at, what the type of incident was, time/date, etc. What this does though is it gives me 5 rows for each store, but the [Most Frequent Incident] value is the same for every row. Basically, it gets the most frequent incident value for the whole table, regardless of which store it came from, and then displays that for each store, even though different stores have different values for the column.

I've been trying to solve this for a while now but haven't been able to :-(

Upvotes: 0

Views: 139

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271141

You have too many joins and no correlation clause.

There are several ways to approach this problem. You have already started with an aggregation in the outer query and then a nested subquery. So, this continues that approach. I think this does what you want:

SELECT s.Name AS [Store Name], COUNT(*) AS [No. Of Incidents],
       Min(CAST(DateNotified AS date)) AS [Oldest Incident],
       Max(CAST(DateNotified AS date)) AS [Latest Incident], 
       (SELECT TOP 1 it.Details
        FROM IncidentDetails id2 INNER JOIN
             IncidentType it2
             On id2.IncidentTypeID = it2.IncidentTypeID
        WHERE id2.StoreId = s.StoreId
        Group By it.Details
        Order By COUNT(*) DESC
       ) AS [Most Freqeuent Incident]
FROM IncidentDetails id INNER JOIN
     Store s
     ON id.StoreID = s.StoreID 
GROUP BY s.Name, s.StoreId;

Notes:

  • Removed the IncidentType table from the outer joins. This doesn't seem needed (although it could be used for filtering).
  • Added s.StoredId to the group by clause. This is needed for the correlation in the subquery.
  • Added a where clause so the subquery is only processed once for each store in the outer query.
  • Removed the join to Store in the subquery. It seems unnecessary, if the queries can be correlated on StoreId.
  • Changed the group by in the subquery to use Details. That is the value being selected.
  • Added table aliases, which make queries easier to write and to read.

Upvotes: 2

Related Questions