Tuttu
Tuttu

Reputation: 137

Struggling to find the right WHERE clause

I'm struggling with a SQL query and I need your help. To be honest, I'm starting to wonder if what I want to achieve can be done the way I did it so far but maybe your collective brains can come up with a better solution than mine and prove me I took the good way at the beginning (Or that I was totally wrong and I should start from scratch).

The Dataset

A row has 4 important fields: ItemID, Item, Priority and Group. Those fields contain the only valuable piece of information, the one that will be displayed in the end.

As I'm using SQL Server 2008, I don't have access to the LAG and LEAD function so I needed to simulate them (Or at least, I did it because I thought it would be useful to me but I'm not so sure anymore). To obtain this result, I used the code from this article from SQLscope which provide you with a LAG and LEAD equivalent that I restrict to a set of row that have the same ItemID. This adds 7 new functional columns to my dataset: Rn, RnDiv2, RnPlus1Div2, PreviousPriority, NextPriority, PreviousGroup and NextGroup.

ItemID      | Item      | Priority  | Group     | Rn        | RnDiv2    | RnPlus1Div2   | PreviousPriority  | NextPriority  | PreviousGroup     | NextGroup 
--------    | -------   | --------  | -------   | -----     | ------    | -----------   | ----------------  | ------------  | -------------     | --------- 
16777397    | Item 1    | 5         | Group 1   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      
16777403    | Item 2    | 5         | Group 2   | 1         | 0         | 1             | NULL              | 5             | NULL              | Group 2   
16777403    | Item 2    | 10        | Group 2   | 2         | 1         | 1             | 5                 | NULL          | Group 2           | NULL      
16777429    | Item 3    | 1000      | Group 3   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      
16777430    | Item 4    | 5         | Group 1   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      
16777454    | Item 5    | 5         | Group 4   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      
16777455    | Item 6    | 5         | Group 5   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      
16777459    | Item 6    | 5         | Group 6   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      
16777468    | Item 8    | 5         | Group 7   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      
16777479    | Item 9    | 5         | Group 4   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      
16777481    | Item 10   | 5         | Group 4   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      
16777496    | Item 11   | 5         | Group 6   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      
16777514    | Item 12   | 5         | Group 4   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      
16777518    | Item 13   | 5         | Group 8   | 1         | 0         | 1             | NULL              | 10            | NULL              | Group 8   
16777518    | Item 13   | 10        | Group 8   | 2         | 1         | 1             | 5                 | 100           | Group 8           | Group 1   
16777518    | Item 13   | 100       | Group 1   | 3         | 1         | 2             | 10                | NULL          | Group 8           | NULL      
16777520    | Item 14   | 5         | Group 9   | 1         | 0         | 1             | NULL              | NULL          | NULL              | NULL      

The problem

The problem in my SQL query is the WHERE clause. I will always filter the rows based on their Group column. But there are some subtlety. Whatever the number of Group an Item is member of, I want it to appear in one and only one Group based on these criteria :

  1. If the Item appears in the same Group more than one time, only the line with the lowest priority should be returned. If an Item appears more than one time in the same Group but with the same Priority, then only the first occurrence should be kept Example: for Item 2, only the line with a Priority value of 5 should be returned;
  2. If the Item appears in the Group but is also present in another Group with a lowest Priority, it shouldn't be displayed. Example: Group 1 is selected as a filter. Item 1 should be displayed but Item 13 shouldn't because it is also present in Group 8 with a lower Priority (Item 13 would appear only in Group 8).

Note that this is just a sample. My real dataset has more than 3000 rows and some other cases are probably possible that I haven't listed in my sample.

Unsuccessful Attempts

Like I said, there is one constant in the WHERE clause and that is the Group filtering.

I will continue to try and find the good WHERE clause but I have the feeling that my whole approach is wrong. I don't see how I could solve the problem when there are more than two entries for the same Item. It is worth noting that this query is used in a SSRS report so I could maybe use custom code to parse the dataset and filter the rows (Working with tables might help solving the issue of Items with more than two entries). But if there's a SQL genius around here with a working solution, that would be great.

PS : if someone knows how to fix this table and can explain it to me, extra cookies for him. :D

Edit :

This is the modified query that I'm using at the moment. I will consider using @Yellowbedwetter's latest query has it seems more robust.

SELECT * 
  FROM (SELECT ItemID,
               Item,
               Priority,
               Group_,
               MIN(Priority) OVER
                 ( PARTITION BY item
                 ) AS interItem_MinPriority
          FROM (SELECT ItemID,
                       Item,
                       Priority,
                       Group_,
                       ROW_NUMBER() OVER
                         ( PARTITION BY Item
                               ORDER BY Priority ASC
                         ) AS interGrp_Rank
                  FROM Test_Table 
               ) AS TMP
         WHERE interGrp_Rank = 1 -- Exclude all records with the same item/group, but higher priority.
       ) AS TMP2
 WHERE Priority = interItem_MinPriority; -- Exclude which aren't the lowest priority across groups.

Upvotes: 1

Views: 95

Answers (3)

Eid Morsy
Eid Morsy

Reputation: 966

If I understand your problem well

about these criteria

  1. If the Item appears in the same Group more than one time, only the line with the lowest priority should be returned. Example: for Item 2, only the line with a Priority value of 5 should be returned;

  2. If the Item appears in the Group but is also present in another Group with a lowest Priority, it shouldn't be displayed. Example: Group 1 is selected as a filter. Item 1 should be displayed but Item 13 shouldn't because it is also present in Group 8 with a lower Priority (Item 13 would appear only in Group 8).

I think we can get the right result by using the minimum priority per item without considering the group of item , because in the two cases above we took the minimum priority of the item.

so the following query might be helpful.(I tested it with your sample data)

with minPriority as
(
select ItemID, Item, Priority , Group_,ROW_NUMBER() over(partition by ItemId order by priority )rn  from Test_table 
)
select * from minPriority where rn=1

Upvotes: 1

Error_2646
Error_2646

Reputation: 3791

If I understand the question correctly this should work

SELECT * 
  FROM (SELECT ItemID,
               Item,
               Priority,
               Group_,
               MIN(Priority) OVER
                 ( PARTITION BY item
                 ) AS interItem_MinPriority
          FROM (SELECT ItemID,
                       Item,
                       Priority,
                       Group_,
                       ROW_NUMBER() OVER
                         ( PARTITION BY Item,
                                        Group_
                               ORDER BY Priority ASC
                         ) AS interGrp_Rank
                  FROM Test_Table 
               ) AS TMP
         WHERE interGrp_Rank = 1 -- Exclude all records with the same item/group, but higher priority.
       ) AS TMP2
 WHERE Priority = interItem_MinPriority; -- Exclude which aren't the lowest priority across groups.

I don't know if your version of SQL Server supports MIN() OVER()..., but if not you should be able to work around that easily enough.

Edit: To handle tie breaks.

WITH TEST_TABLE (ItemID, Item, Priority, Group_) AS 
 (
 SELECT '16777397','Item 1','5','Group 1' UNION 
 SELECT '16777403','Item 2','5','Group 2' UNION 
 SELECT '16777403','Item 2','10','Group 2' UNION 
 SELECT '16777429','Item 3','1000','Group 3' UNION 
 SELECT '16777430','Item 4','5','Group 1' UNION 
 SELECT '16777454','Item 5','5','Group 4' UNION 
 SELECT '16777455','Item 6','5','Group 5' UNION 
 SELECT '16777459','Item 6','5','Group 6' UNION 
 SELECT '16777468','Item 8','5','Group 7' UNION 
 SELECT '16777479','Item 9','5','Group 4' UNION 
 SELECT '16777481','Item 10','5','Group 4' UNION 
 SELECT '16777496','Item 11','5','Group 6' UNION 
 SELECT '16777514','Item 12','5','Group 4' UNION 
 SELECT '16777518','Item 13','5','Group 8' UNION 
 SELECT '16777518','Item 13','10','Group 8' UNION 
 SELECT '16777518','Item 13','100','Group 1' UNION 
 SELECT '16777520','Item 14','5','Group 9'
 ) 

 SELECT ItemID,
        Item,
        Priority,
        Group_
   FROM (SELECT ItemID,
                Item,
                Priority,
                Group_,
                ROW_NUMBER() OVER
                  ( PARTITION BY item
                        ORDER BY Group_ ASC -- or however you want to break the tie
                  ) AS grp_minPriority_TieBreak
           FROM (SELECT ItemID,
                        Item,
                        Priority,
                        Group_,
                        MIN(Priority) OVER
                          ( PARTITION BY item
                          ) AS interItem_MinPriority
                   FROM (SELECT ItemID,
                                Item,
                                Priority,
                                Group_,
                                ROW_NUMBER() OVER
                                  ( PARTITION BY Item,
                                                 Group_
                                        ORDER BY Priority ASC
                                  ) AS interGrp_Rank
                           FROM TEST_TABLE 
                        ) AS TMP
                  WHERE interGrp_Rank = 1 -- Exclude all records with the same item/group, but higher priority.
                ) AS TMP2
          WHERE Priority = interItem_MinPriority -- Exclude which aren't the lowest priority across groups.
       ) AS TMP2
  WHERE grp_minPriority_TieBreak = 1;

Upvotes: 1

Ian
Ian

Reputation: 27

Haven't tried it but something like..`select max(priority) as mp ..... From ... Where group = 'group1' and mp not in (select max(priority).... from ... Where group <> 'group1'

Apologies for the typing, on my phone no glasses :)

Upvotes: 0

Related Questions