Ben Whitely
Ben Whitely

Reputation: 59

Issues with grouping this SQL query

So I'm trying to find the month/year that had the highest number of sale transactions.

my query currently is:

SELECT
   DATENAME(M, OrderDate) as orderMonth, 
   year(OrderDate) as orderYear, 
   count(SalesOrderID) as orderCount
FROM
   Sales.SalesOrderHeader soh
GROUP BY OrderDate
HAVING SUM(soh.SalesOrderID) >= ALL (
SELECT SUM(SalesOrderID) FROM Sales.SalesOrderHeader
GROUP BY OrderDate
)

however if I run everything above the HAVING line so that it returns all columns instead of just the highest column, it returns several duplicates of months/years and the orderCounts. for example, June 2011 has about 30 rows being returned in this query, each of those ranging somewhere between 2 and 11 orderCounts, in total the query returns 1124 rows, where it should only be returning 38 since the sales range from 2011 - 2014 and there's 38 months total within that range.

I'm pretty sure I need to specify a monthly group and should be changing my GROUP BYs to something like:

GROUP BY DATENAME(month, soh.OrderDate), DATENAME(YYYY, soh.OrderDate)

but then i get an error "Each GROUP BY expression must contain at least one column that is not an outer reference"

Upvotes: 0

Views: 329

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269633

Your problem is that you are aggregating by OrderDate rather than by the month and year. So, your version of the query should look like:

SELECT DATENAME(MONTH, OrderDate) as orderMonth, 
       YEAR(OrderDate) as orderYear, 
       COUNT(*) as orderCount
FROM Sales.SalesOrderHeader soh
GROUP BY DATENAME(MONTH, OrderDate), YEAR(OrderDate)
HAVING COUNT(*) >= ALL (SELECT COUNT(*)
                        FROM Sales.SalesOrderHeader soh2
                        GROUP BY DATENAME(MONTH, OrderDate), YEAR(OrderDate)
                       );

However, no one would really write the query like that. It is simpler and more performant to use TOP and ORDER BY. The equivalent of your query is:

SELECT TOP (1) WITH TIES DATENAME(MONTH, OrderDate) as orderMonth, 
       YEAR(OrderDate) as orderYear, 
       COUNT(*) as orderCount
FROM Sales.SalesOrderHeader soh
GROUP BY DATENAME(MONTH, OrderDate), YEAR(OrderDate)
ORDER BY orderCount DESC;

Both these return all months with the maximum value -- if there are duplicates. If you want to guarantee only one row in the result set, use SELECT TOP (1) rather than SELECT TOP (1) WITH TIES.

Upvotes: 1

Snuka
Snuka

Reputation: 11

Not sure which sql syntax you are using, but you could just sort by transactions and select the highest record?

SELECT top 1
   DATENAME(M, OrderDate) as orderMonth, 
   year(OrderDate) as orderYear, 
   count(SalesOrderID) as orderCount
FROM
   Sales.SalesOrderHeader soh
Group by orderMonth, orderYear

order by orderCount asc

Upvotes: 0

Related Questions