Reputation: 146
I have the following query:
SELECT
D.[Year] AS [Year]
, D.[Month] AS [Month]
, CASE
WHEN f.Dept IN ('XSD') THEN 'Marketing'
ELSE f.Dept
END AS DeptS
, COUNT(DISTINCT f.OrderNo) AS CountOrders
FROM Sales.LocalOrders AS l WITH
INNER JOIN Sales.FiscalOrders AS f
ON l.ORDER_NUMBER = f.OrderNo
INNER JOIN Dimensions.Date_Dim AS D
ON CAST(D.[Date] AS DATE) = CAST(f.OrderDate AS DATE)
WHERE YEAR(f.OrderDate) = 2019
AND f.Dept IN ('XSD', 'PPM', 'XPP')
GROUP BY
D.[Year]
, D.[Month]
, f.Dept
ORDER BY
D.[Year] ASC
, D.[Month] ASC
I get the following result the ORDER BY
isn't giving the right result with Month column as we can see it is not ordered:
Year Month Depts CountOrders
2019 1 XSD 200
2019 10 PPM 290
2019 10 XPP 150
2019 2 XSD 200
2019 3 XPP 300
The expected output:
Year Month Depts CountOrders
2019 1 XSD 200
2019 2 XSD 200
2019 3 XPP 300
2019 10 PPM 290
2019 10 XPP 150
Upvotes: 2
Views: 314
Reputation: 1269563
The simplest solution is:
ORDER BY MIN(D.DATE)
or:
ORDER BY MIN(f.ORDER_DATE)
Fiddling with the year
and month
columns is totally unnecessary when you have a date column that is available.
Upvotes: 1
Reputation: 1497
It is ordered by month, as your D.[Month]
is treated like a text string in the ORDER BY
clause.
You could do one of two things to fix this:
ORDER BY
clause that will be recognized as representing a monthYou can correct this in your code by quickly changing the ORDER BY
clause to analyze those columns as though they are numbers, which is done by converting ("casting") them to an integer data type like this:
ORDER BY
CAST(D.[Year] AS INT) ASC
,CAST(D.[Month] AS INT) ASC
This will correct your unexpected query results, but does not address the root cause, which is your underlying data (more on that below).
The root cause of your issue is how your underlying data is stored and/or surfaced.
Your Month
seems to be appearing as a default data type (VarChar), rather than something more specifically suited to a month or date.
If you administer or have access to or control over the database, it is a good idea to consider correcting this.
In considering this, be mindful of potential context and change management issues, including:
None of these issues are a good excuse to leave something set up incorrectly forever, which will likely compound the issue and introduce others. However, that is only part of the story.
The appropriate approach (correct it, or leave it) will depend on your situation. In a perfect textbook world, you'd correct it. In your world, you will have to decide.
The above solution is a bit of a quick and nasty way to force your query to work.
The fact that the solution CAST
s late in the query syntax, after the results have been selected and filtered, hints that is not the most elegant way to achieve this.
Ideally you can convert data types as early as possible in the process:
In your case, your GROUP BY and ORDER BY are both using columns that look to be redundant data from the original query results, that is, you are getting a DATE and a MONTH and a YEAR. Ideally you would just get a DATE and then use the MONTH or YEAR from that date. Your issue is your dates are not actually dates (see "underlying data" above), which:
INNER JOIN
line ON CAST(D.[Date] AS DATE) = CAST(f.OrderDate AS DATE)
(likely to minimise issues with the join)ORDER BY
)You could consider ignoring D.[month] and use the MONTH
DATEPART
computed from DATE
, which would avoid the need to use CAST
in the ORDER BY
clause.
In your instance, this approach is a middle ground. The quick fix is included at the top of this answer, and the best fix is to correct the underlying data. This last section considers optimizing the quick fix, but does not correct the underlying issue. It is only mentioned for awareness and to avoid promoting the use of CAST
in an ORDER BY
clause as the most legitimate way of addressing your issue with good clean query syntax.
There are also potential performance tradeoffs between how many columns you select that you don't need (e.g. all of the ones in D
?), whether to compute the month from the date or a seperate month column, whether to cast to date before filtering, etc. These are beyond the scope of this solution.
So:
Upvotes: 5
Reputation: 95554
The real problem is your object Dimensions.Date_Dim
here. As you are simply ordering on the value of D.[Year]
and D.[Month]
without manipulating the values at all, this means the object is severely flawed; you are storing numerical data as a varchar
. varchar
, and numerical data types are completely different. For example 2
is less than 10
but '2'
is greater than '10'
; because '2'
is greater than '1'
, so therefore it must also be greater than '10'
.
The real solution, therefore, is fixing your object. Assuming that both Month
and Year
are incorrectly stored as a varchar
, don't have any non-integer values (another and different flaw if so), and not a computed column then you could just do:
ALTER TABLE Dimensions.Date_Dim ALTER COLUMN [Year] int NOT NULL;
ALTER TABLE Dimensions.Date_Dim ALTER COLUMN [Month] int NOT NULL;
You could, however, also make the columns a PERSISTED
computed column, which might well be easier, in my opinion, as DATEPART
already returns a strongly typed int
value.
ALTER TABLE dbo.Date_Dim DROP COLUMN [Month];
ALTER TABLE dbo.Date_Dim ADD [Month] AS DATEPART(MONTH,[Date]) PERSISTED;
Of course, for both solutions, you'll need to (first) DROP
and (afterwards) reCREATE
any indexes and constraints on the columns.
Upvotes: 2
Reputation: 1
A very common issue when you store numerical data as a varchar/nvarchar. Try to cast Year and Month to INT.
ORDER BY
CAST(D.[Year] AS INT) ASC
,CAST(D.[Month] AS INT) ASC
If you try using the <, > and BETWEEN operators, you will get some really "weird" results.
Upvotes: 0
Reputation: 614
As long as your "Month" is always 1-12, you can use
SELECT ..., TRY_CAST(D.[Month] AS INT) AS [Month],...
ORDER BY TRY_CAST(D.[Month] AS INT)
Upvotes: 1