Reputation: 137
I'm struggling with a SQL query and I need your help. To be honest, I'm starting to wonder if what I want to achieve can be done the way I did it so far but maybe your collective brains can come up with a better solution than mine and prove me I took the good way at the beginning (Or that I was totally wrong and I should start from scratch).
A row has 4 important fields: ItemID
, Item
, Priority
and Group
. Those fields contain the only valuable piece of information, the one that will be displayed in the end.
As I'm using SQL Server 2008, I don't have access to the LAG
and LEAD
function so I needed to simulate them (Or at least, I did it because I thought it would be useful to me but I'm not so sure anymore). To obtain this result, I used the code from this article from SQLscope which provide you with a LAG
and LEAD
equivalent that I restrict to a set of row that have the same ItemID
. This adds 7 new functional columns to my dataset: Rn
, RnDiv2
, RnPlus1Div2
, PreviousPriority
, NextPriority
, PreviousGroup
and NextGroup
.
ItemID | Item | Priority | Group | Rn | RnDiv2 | RnPlus1Div2 | PreviousPriority | NextPriority | PreviousGroup | NextGroup
-------- | ------- | -------- | ------- | ----- | ------ | ----------- | ---------------- | ------------ | ------------- | ---------
16777397 | Item 1 | 5 | Group 1 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
16777403 | Item 2 | 5 | Group 2 | 1 | 0 | 1 | NULL | 5 | NULL | Group 2
16777403 | Item 2 | 10 | Group 2 | 2 | 1 | 1 | 5 | NULL | Group 2 | NULL
16777429 | Item 3 | 1000 | Group 3 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
16777430 | Item 4 | 5 | Group 1 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
16777454 | Item 5 | 5 | Group 4 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
16777455 | Item 6 | 5 | Group 5 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
16777459 | Item 6 | 5 | Group 6 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
16777468 | Item 8 | 5 | Group 7 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
16777479 | Item 9 | 5 | Group 4 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
16777481 | Item 10 | 5 | Group 4 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
16777496 | Item 11 | 5 | Group 6 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
16777514 | Item 12 | 5 | Group 4 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
16777518 | Item 13 | 5 | Group 8 | 1 | 0 | 1 | NULL | 10 | NULL | Group 8
16777518 | Item 13 | 10 | Group 8 | 2 | 1 | 1 | 5 | 100 | Group 8 | Group 1
16777518 | Item 13 | 100 | Group 1 | 3 | 1 | 2 | 10 | NULL | Group 8 | NULL
16777520 | Item 14 | 5 | Group 9 | 1 | 0 | 1 | NULL | NULL | NULL | NULL
The problem in my SQL query is the WHERE
clause. I will always filter the rows based on their Group
column. But there are some subtlety. Whatever the number of Group
an Item
is member of, I want it to appear in one and only one Group
based on these criteria :
Item
appears in the same Group
more than one time, only the line with the lowest priority
should be returned. If an Item
appears more than one time in the same Group
but with the same Priority
, then only the first occurrence should be kept Example: for Item 2
, only the line with a Priority
value of 5 should be returned;Item
appears in the Group
but is also present in another Group
with a lowest Priority
, it shouldn't be displayed. Example: Group 1
is selected as a filter. Item 1
should be displayed but Item 13
shouldn't because it is also present in Group 8
with a lower Priority
(Item 13
would appear only in Group 8
).Note that this is just a sample. My real dataset has more than 3000 rows and some other cases are probably possible that I haven't listed in my sample.
Like I said, there is one constant in the WHERE
clause and that is the Group
filtering.
WHERE Group = 'Group 1'
and I need to have something a bit more complex. WHERE Group = 'Group 1' AND (Group = NextGroup AND Priority < NextPriority)
. That works well in the case of an Item
that is in no more that 2 groups. But for Item 13
, it would return the first two rows. And if I add something like AND NOT (CorrectedPriority >= PreviousPriority)
to the WHERE
clause, I get no results at all.(SiteName <> PreviousSiteName AND CorrectedPriority >= PreviousPriority)
. The problem is that I will never return a line where Rn = 1
because PreviousSiteName
will be equal to NULL
. Adding a check on NULL
doesn't work either. I must have bee tired when trying this particular clause because it's complete garbage.I will continue to try and find the good WHERE
clause but I have the feeling that my whole approach is wrong. I don't see how I could solve the problem when there are more than two entries for the same Item
. It is worth noting that this query is used in a SSRS report so I could maybe use custom code to parse the dataset and filter the rows (Working with tables might help solving the issue of Item
s with more than two entries). But if there's a SQL genius around here with a working solution, that would be great.
PS : if someone knows how to fix this table and can explain it to me, extra cookies for him. :D
This is the modified query that I'm using at the moment. I will consider using @Yellowbedwetter's latest query has it seems more robust.
SELECT *
FROM (SELECT ItemID,
Item,
Priority,
Group_,
MIN(Priority) OVER
( PARTITION BY item
) AS interItem_MinPriority
FROM (SELECT ItemID,
Item,
Priority,
Group_,
ROW_NUMBER() OVER
( PARTITION BY Item
ORDER BY Priority ASC
) AS interGrp_Rank
FROM Test_Table
) AS TMP
WHERE interGrp_Rank = 1 -- Exclude all records with the same item/group, but higher priority.
) AS TMP2
WHERE Priority = interItem_MinPriority; -- Exclude which aren't the lowest priority across groups.
Upvotes: 1
Views: 95
Reputation: 966
If I understand your problem well
about these criteria
If the Item appears in the same Group more than one time, only the line with the lowest priority should be returned. Example: for Item 2, only the line with a Priority value of 5 should be returned;
If the Item appears in the Group but is also present in another Group with a lowest Priority, it shouldn't be displayed. Example: Group 1 is selected as a filter. Item 1 should be displayed but Item 13 shouldn't because it is also present in Group 8 with a lower Priority (Item 13 would appear only in Group 8).
I think we can get the right result by using the minimum priority per item without considering the group of item , because in the two cases above we took the minimum priority of the item.
so the following query might be helpful.(I tested it with your sample data)
with minPriority as
(
select ItemID, Item, Priority , Group_,ROW_NUMBER() over(partition by ItemId order by priority )rn from Test_table
)
select * from minPriority where rn=1
Upvotes: 1
Reputation: 3791
If I understand the question correctly this should work
SELECT *
FROM (SELECT ItemID,
Item,
Priority,
Group_,
MIN(Priority) OVER
( PARTITION BY item
) AS interItem_MinPriority
FROM (SELECT ItemID,
Item,
Priority,
Group_,
ROW_NUMBER() OVER
( PARTITION BY Item,
Group_
ORDER BY Priority ASC
) AS interGrp_Rank
FROM Test_Table
) AS TMP
WHERE interGrp_Rank = 1 -- Exclude all records with the same item/group, but higher priority.
) AS TMP2
WHERE Priority = interItem_MinPriority; -- Exclude which aren't the lowest priority across groups.
I don't know if your version of SQL Server supports MIN() OVER()..., but if not you should be able to work around that easily enough.
Edit: To handle tie breaks.
WITH TEST_TABLE (ItemID, Item, Priority, Group_) AS
(
SELECT '16777397','Item 1','5','Group 1' UNION
SELECT '16777403','Item 2','5','Group 2' UNION
SELECT '16777403','Item 2','10','Group 2' UNION
SELECT '16777429','Item 3','1000','Group 3' UNION
SELECT '16777430','Item 4','5','Group 1' UNION
SELECT '16777454','Item 5','5','Group 4' UNION
SELECT '16777455','Item 6','5','Group 5' UNION
SELECT '16777459','Item 6','5','Group 6' UNION
SELECT '16777468','Item 8','5','Group 7' UNION
SELECT '16777479','Item 9','5','Group 4' UNION
SELECT '16777481','Item 10','5','Group 4' UNION
SELECT '16777496','Item 11','5','Group 6' UNION
SELECT '16777514','Item 12','5','Group 4' UNION
SELECT '16777518','Item 13','5','Group 8' UNION
SELECT '16777518','Item 13','10','Group 8' UNION
SELECT '16777518','Item 13','100','Group 1' UNION
SELECT '16777520','Item 14','5','Group 9'
)
SELECT ItemID,
Item,
Priority,
Group_
FROM (SELECT ItemID,
Item,
Priority,
Group_,
ROW_NUMBER() OVER
( PARTITION BY item
ORDER BY Group_ ASC -- or however you want to break the tie
) AS grp_minPriority_TieBreak
FROM (SELECT ItemID,
Item,
Priority,
Group_,
MIN(Priority) OVER
( PARTITION BY item
) AS interItem_MinPriority
FROM (SELECT ItemID,
Item,
Priority,
Group_,
ROW_NUMBER() OVER
( PARTITION BY Item,
Group_
ORDER BY Priority ASC
) AS interGrp_Rank
FROM TEST_TABLE
) AS TMP
WHERE interGrp_Rank = 1 -- Exclude all records with the same item/group, but higher priority.
) AS TMP2
WHERE Priority = interItem_MinPriority -- Exclude which aren't the lowest priority across groups.
) AS TMP2
WHERE grp_minPriority_TieBreak = 1;
Upvotes: 1
Reputation: 27
Haven't tried it but something like..`select max(priority) as mp ..... From ... Where group = 'group1' and mp not in (select max(priority).... from ... Where group <> 'group1'
Apologies for the typing, on my phone no glasses :)
Upvotes: 0