Reputation: 1
I'm simply trying to profile a table, using a basic case when statement, to count the number of customers who fall into a band, and further split by another column.
My SQL Server query is returning multiple rows of the same definition however. I would expect 4 rows for the following 2 bands bands, but instead I'm getting multiple rows for all the risk band definitions.
Example of returning multiple rows for A1a and A1b, when would expect 4 rows for this - count of id, risk band (20 variables) and indicator (Y/N):
(No column name) | SRIRiskBands | HighlyLikelyIndicator |
---|---|---|
31423 | A1b | N |
4702 | A1b | Y |
29427 | A1b | N |
4788 | A1b | Y |
27854 | A1b | N |
4543 | A1b | Y |
4349 | A1b | Y |
26029 | A1b | N |
4291 | A1a | Y |
25459 | A1a | N |
24352 | A1a | N |
4100 | A1a | Y |
3875 | A1a | Y |
22689 | A1a | N |
4130 | A1a | Y |
22454 | A1a | N |
20619 | A1a | N |
Code used:
select count(distinct involvedPartyId),
case when MachineLearningScores >= 934.50 THEN 'A1a'
when MachineLearningScores >= 920.50 and MachineLearningScores <= 934.49 THEN 'A1b'
when MachineLearningScores >= 907.50 and MachineLearningScores <= 920.49 THEN 'A1c'
when MachineLearningScores >= 895.50 and MachineLearningScores <= 907.49 THEN 'A1d'
when MachineLearningScores >= 882.50 and MachineLearningScores <= 895.49 THEN 'A1e'
when MachineLearningScores >= 869.50 and MachineLearningScores <= 882.49 THEN 'A1f'
when MachineLearningScores >= 859.50 and MachineLearningScores <= 869.49 THEN 'A1g'
when MachineLearningScores >= 848.50 and MachineLearningScores <= 859.49 THEN 'A1h'
when MachineLearningScores >= 839.50 and MachineLearningScores <= 848.49 THEN 'A1i'
when MachineLearningScores >= 830.50 and MachineLearningScores <= 839.49 THEN 'A2'
when MachineLearningScores >= 809.50 and MachineLearningScores <= 830.49 THEN 'A3'
when MachineLearningScores >= 796.50 and MachineLearningScores <= 809.49 THEN 'A4'
when MachineLearningScores >= 789.50 and MachineLearningScores <= 796.49 THEN 'A5'
when MachineLearningScores >= 767.50 and MachineLearningScores <= 789.49 THEN 'B1'
when MachineLearningScores >= 752.50 and MachineLearningScores <= 767.49 THEN 'B2'
when MachineLearningScores >= 733.50 and MachineLearningScores <= 752.49 THEN 'B3'
when MachineLearningScores >= 724.50 and MachineLearningScores <= 733.49 THEN 'C1'
when MachineLearningScores >= 685.50 and MachineLearningScores <= 724.49 THEN 'D1'
when MachineLearningScores >= 649.50 and MachineLearningScores <= 685.49 THEN 'E1'
when MachineLearningScores < 649.49 THEN 'F1'
ELSE 'No MachineLearningScores'
END AS 'SRIRiskBand', HighlyLikelyIndicator
from MarketingAnalytics.dbo.Ri_PA_Personal
where batchid = (SELECT MAX(BatchId) FROM MarketingAnalytics.dbo.Ri_PA_Personal)
group by MachineLearningScores, HighlyLikelyIndicator
order by MachineLearningScores asc
Any help would be greatly appreciated.
Upvotes: 0
Views: 61
Reputation: 5169
Something like :
WITH
vals AS
(SELECT *
FROM (VALUES (934.50, 'A1a'),
(920.50, 'A1b'),
(907.50, 'A1c'),
(895.50, 'A1d'),
(882.50, 'A1e'),
(869.50, 'A1f'),
(859.50, 'A1g'),
(848.50, 'A1h'),
(839.50, 'A1i'),
(830.50, 'A2'),
(809.50, 'A3'),
(796.50, 'A4'),
(789.50, 'A5'),
(767.50, 'B1'),
(752.50, 'B2'),
(733.50, 'B3'),
(724.50, 'C1'),
(685.50, 'D1'),
(649.50, 'E1'),
(649.49, 'F1')
) AS VR (val, ref )
)
SELECT count(distinct involvedPartyId) AS CNT,
CASE
WHEN LEAD(val) OVER(ORDER BY val) IS NULL THEN 'F1'
WHEN LAG(val) OVER(ORDER BY val) IS NULL THEN 'A1a'
ELSE ref
END AS SRIRiskBandLAG
FROM MarketingAnalytics.dbo.Ri_PA_Personal AS RPP
LEFT OUTER JOIN vals AS v
ON MachineLearningScores < v.val
AND >= MachineLearningScores >= LAG(val) OVER(ORDER BY val)
where batchid = (SELECT MAX(BatchId) FROM MarketingAnalytics.dbo.Ri_PA_Personal)
group by val, ref, HighlyLikelyIndicator
order by val asc
Upvotes: 0
Reputation: 1
This result happens because the "group by" clause is using the original field without treating it. You can solve this by using the "group by" clause by copying the "case when" clause or by using a subselect to group
Solution 1
select count(distinct involvedPartyId),
case when MachineLearningScores >= 934.50 THEN 'A1a'
when MachineLearningScores >= 920.50 and MachineLearningScores <= 934.49 THEN 'A1b'
when MachineLearningScores >= 907.50 and MachineLearningScores <= 920.49 THEN 'A1c'
when MachineLearningScores >= 895.50 and MachineLearningScores <= 907.49 THEN 'A1d'
when MachineLearningScores >= 882.50 and MachineLearningScores <= 895.49 THEN 'A1e'
when MachineLearningScores >= 869.50 and MachineLearningScores <= 882.49 THEN 'A1f'
when MachineLearningScores >= 859.50 and MachineLearningScores <= 869.49 THEN 'A1g'
when MachineLearningScores >= 848.50 and MachineLearningScores <= 859.49 THEN 'A1h'
when MachineLearningScores >= 839.50 and MachineLearningScores <= 848.49 THEN 'A1i'
when MachineLearningScores >= 830.50 and MachineLearningScores <= 839.49 THEN 'A2'
when MachineLearningScores >= 809.50 and MachineLearningScores <= 830.49 THEN 'A3'
when MachineLearningScores >= 796.50 and MachineLearningScores <= 809.49 THEN 'A4'
when MachineLearningScores >= 789.50 and MachineLearningScores <= 796.49 THEN 'A5'
when MachineLearningScores >= 767.50 and MachineLearningScores <= 789.49 THEN 'B1'
when MachineLearningScores >= 752.50 and MachineLearningScores <= 767.49 THEN 'B2'
when MachineLearningScores >= 733.50 and MachineLearningScores <= 752.49 THEN 'B3'
when MachineLearningScores >= 724.50 and MachineLearningScores <= 733.49 THEN 'C1'
when MachineLearningScores >= 685.50 and MachineLearningScores <= 724.49 THEN 'D1'
when MachineLearningScores >= 649.50 and MachineLearningScores <= 685.49 THEN 'E1'
when MachineLearningScores < 649.49 THEN 'F1'
ELSE 'No MachineLearningScores'
END AS 'SRIRiskBand', HighlyLikelyIndicator
from MarketingAnalytics.dbo.Ri_PA_Personal
where batchid = (SELECT MAX(BatchId) FROM MarketingAnalytics.dbo.Ri_PA_Personal)
group by case when MachineLearningScores >= 934.50 THEN 'A1a'
when MachineLearningScores >= 920.50 and MachineLearningScores <= 934.49 THEN 'A1b'
when MachineLearningScores >= 907.50 and MachineLearningScores <= 920.49 THEN 'A1c'
when MachineLearningScores >= 895.50 and MachineLearningScores <= 907.49 THEN 'A1d'
when MachineLearningScores >= 882.50 and MachineLearningScores <= 895.49 THEN 'A1e'
when MachineLearningScores >= 869.50 and MachineLearningScores <= 882.49 THEN 'A1f'
when MachineLearningScores >= 859.50 and MachineLearningScores <= 869.49 THEN 'A1g'
when MachineLearningScores >= 848.50 and MachineLearningScores <= 859.49 THEN 'A1h'
when MachineLearningScores >= 839.50 and MachineLearningScores <= 848.49 THEN 'A1i'
when MachineLearningScores >= 830.50 and MachineLearningScores <= 839.49 THEN 'A2'
when MachineLearningScores >= 809.50 and MachineLearningScores <= 830.49 THEN 'A3'
when MachineLearningScores >= 796.50 and MachineLearningScores <= 809.49 THEN 'A4'
when MachineLearningScores >= 789.50 and MachineLearningScores <= 796.49 THEN 'A5'
when MachineLearningScores >= 767.50 and MachineLearningScores <= 789.49 THEN 'B1'
when MachineLearningScores >= 752.50 and MachineLearningScores <= 767.49 THEN 'B2'
when MachineLearningScores >= 733.50 and MachineLearningScores <= 752.49 THEN 'B3'
when MachineLearningScores >= 724.50 and MachineLearningScores <= 733.49 THEN 'C1'
when MachineLearningScores >= 685.50 and MachineLearningScores <= 724.49 THEN 'D1'
when MachineLearningScores >= 649.50 and MachineLearningScores <= 685.49 THEN 'E1'
when MachineLearningScores < 649.49 THEN 'F1'
ELSE 'No MachineLearningScores'
END AS 'SRIRiskBand', HighlyLikelyIndicator
order by case when MachineLearningScores >= 934.50 THEN 'A1a'
when MachineLearningScores >= 920.50 and MachineLearningScores <= 934.49 THEN 'A1b'
when MachineLearningScores >= 907.50 and MachineLearningScores <= 920.49 THEN 'A1c'
when MachineLearningScores >= 895.50 and MachineLearningScores <= 907.49 THEN 'A1d'
when MachineLearningScores >= 882.50 and MachineLearningScores <= 895.49 THEN 'A1e'
when MachineLearningScores >= 869.50 and MachineLearningScores <= 882.49 THEN 'A1f'
when MachineLearningScores >= 859.50 and MachineLearningScores <= 869.49 THEN 'A1g'
when MachineLearningScores >= 848.50 and MachineLearningScores <= 859.49 THEN 'A1h'
when MachineLearningScores >= 839.50 and MachineLearningScores <= 848.49 THEN 'A1i'
when MachineLearningScores >= 830.50 and MachineLearningScores <= 839.49 THEN 'A2'
when MachineLearningScores >= 809.50 and MachineLearningScores <= 830.49 THEN 'A3'
when MachineLearningScores >= 796.50 and MachineLearningScores <= 809.49 THEN 'A4'
when MachineLearningScores >= 789.50 and MachineLearningScores <= 796.49 THEN 'A5'
when MachineLearningScores >= 767.50 and MachineLearningScores <= 789.49 THEN 'B1'
when MachineLearningScores >= 752.50 and MachineLearningScores <= 767.49 THEN 'B2'
when MachineLearningScores >= 733.50 and MachineLearningScores <= 752.49 THEN 'B3'
when MachineLearningScores >= 724.50 and MachineLearningScores <= 733.49 THEN 'C1'
when MachineLearningScores >= 685.50 and MachineLearningScores <= 724.49 THEN 'D1'
when MachineLearningScores >= 649.50 and MachineLearningScores <= 685.49 THEN 'E1'
when MachineLearningScores < 649.49 THEN 'F1'
ELSE 'No MachineLearningScores'
END AS 'SRIRiskBand' asc
Solution 2
select count(involvedPartyId),
SRIRiskBand,
HighlyLikelyIndicator
from
(
select involvedPartyId,
case when MachineLearningScores >= 934.50 THEN 'A1a'
when MachineLearningScores >= 920.50 and MachineLearningScores <= 934.49 THEN 'A1b'
when MachineLearningScores >= 907.50 and MachineLearningScores <= 920.49 THEN 'A1c'
when MachineLearningScores >= 895.50 and MachineLearningScores <= 907.49 THEN 'A1d'
when MachineLearningScores >= 882.50 and MachineLearningScores <= 895.49 THEN 'A1e'
when MachineLearningScores >= 869.50 and MachineLearningScores <= 882.49 THEN 'A1f'
when MachineLearningScores >= 859.50 and MachineLearningScores <= 869.49 THEN 'A1g'
when MachineLearningScores >= 848.50 and MachineLearningScores <= 859.49 THEN 'A1h'
when MachineLearningScores >= 839.50 and MachineLearningScores <= 848.49 THEN 'A1i'
when MachineLearningScores >= 830.50 and MachineLearningScores <= 839.49 THEN 'A2'
when MachineLearningScores >= 809.50 and MachineLearningScores <= 830.49 THEN 'A3'
when MachineLearningScores >= 796.50 and MachineLearningScores <= 809.49 THEN 'A4'
when MachineLearningScores >= 789.50 and MachineLearningScores <= 796.49 THEN 'A5'
when MachineLearningScores >= 767.50 and MachineLearningScores <= 789.49 THEN 'B1'
when MachineLearningScores >= 752.50 and MachineLearningScores <= 767.49 THEN 'B2'
when MachineLearningScores >= 733.50 and MachineLearningScores <= 752.49 THEN 'B3'
when MachineLearningScores >= 724.50 and MachineLearningScores <= 733.49 THEN 'C1'
when MachineLearningScores >= 685.50 and MachineLearningScores <= 724.49 THEN 'D1'
when MachineLearningScores >= 649.50 and MachineLearningScores <= 685.49 THEN 'E1'
when MachineLearningScores < 649.49 THEN 'F1'
ELSE 'No MachineLearningScores'
END AS 'SRIRiskBand', HighlyLikelyIndicator
from MarketingAnalytics.dbo.Ri_PA_Personal
where batchid = (SELECT MAX(BatchId) FROM MarketingAnalytics.dbo.Ri_PA_Personal
) AS temp
group by SRIRiskBand,
HighlyLikelyIndicator
Upvotes: 0