Kirill Pashkov
Kirill Pashkov

Reputation: 3236

TSQL filtering by character match

How to filter out number of count matches without building new user functions(i.e. you can use built-in functions) on a given data?

The requirement is to get rows with the gw column numbers appearing the same amount of times or if there is different set of amounts their number must match the other ones count. I.e. it could be all 1 like the Sandy's or it could be Don since it has '1' two times and '2' two times as well. Voland would not meet the requirements since he has '1' two times but only once '2' and etc. You don't want to count '0' at all.

login      gw1 gw2 gw3 gw4 gw5
Peter       1   0   1   0   0
Sandy       1   1   1   1   0
Voland      1   0   1   2   0
Don         1   2   0   1   2

Diserid output is:

login      gw1 gw2 gw3 gw4 gw5
Peter       1   0   1   0   0
Sandy       1   1   1   1   0
Don         1   2   0   1   2

Values could be any positive number of times. To match the criteria values also has to be at least twice total. I.e. 1 2 3 4 0 is not OK. since every value appears only once. 1 1 0 3 3 is a match.

Upvotes: 3

Views: 77

Answers (3)

Christian Barron
Christian Barron

Reputation: 2755

I know I'm late to the party, I can't type as fast as some and I think I arrived about 40 minutes late but since I done it, I thought I'd share it anyway. My method used unpivot and pivot to achieve the result:

Select * 
from foobar f1
where exists 
(Select * from
(Select login_, Case when [1] = 0 then null else [1] % 2 end Val1, Case when [2] = 0 then null else [2] % 2 end Val2, 
Case when [3] = 0 then null else [3] % 2 end Val3, Case when [4] = 0 then null else [4] % 2 end Val4, Case when [5] = 0 then null else [5] % 2 end Val5
from
(Select *
from
(select * from foobar) src
UNPIVOT
(value for amount in (gw1, gw2, gw3, gw4, gw5)) unpvt) src2
PIVOT
(count(amount) for value in ([1],[2],[3],[4],[5])) as pvt) res
Where 0 in (Val1,Val2, Val3, Val4, Val5) and not exists (select * from foobar where 1 in (Val1, Val2, Val3, Val4, Val5)) and login_ = f1.login_)

and here is the fiddle: http://www.sqlfiddle.com/#!6/b78f8/1/0

Upvotes: 1

StuartLC
StuartLC

Reputation: 107247

I think this logic is correct, viz

Find the rows where there is at least one even number of equal gws and no instance of odd number of gws. Zeroes are excluded

Find the rows where there is at least one number of equal gws and no instance of a single gws. Zeroes are excluded.

Unpivot is used to simplify reasoning over the gwx columns, and CTEs are used to prevent repetition.

WITH unpvt AS
(
    SELECT * 
    FROM MyTable
    UNPIVOT 
    (
        gwvalue
        for z in (gw1, gw2, gw3, gw4, gw5)
    ) x
),
grp AS
(
    SELECT [login], gwvalue, COUNT(gwvalue) gwCount
    FROM unpvt 
    WHERE gwvalue > 0
    GROUP BY [login], gwvalue
)
SELECT 
* 
FROM MyTable mt
WHERE EXISTS
(
    SELECT 1
    FROM grp g
    WHERE g.[login] = mt.[login]
    AND gwCount > 1
)
AND NOT EXISTS 
(
    SELECT 1
    FROM grp g
    WHERE g.[login] = mt.[login]
    AND gwCount = 1
);

SqlFiddle here

Upvotes: 0

Felix Pamittan
Felix Pamittan

Reputation: 31879

SQL Fiddle

WITH Cte(login, gw) AS(
    SELECT login, gw1 FROM TestData WHERE gw1 > 0 UNION ALL
    SELECT login, gw2 FROM TestData WHERE gw2 > 0 UNION ALL
    SELECT login, gw3 FROM TestData WHERE gw3 > 0 UNION ALL
    SELECT login, gw4 FROM TestData WHERE gw4 > 0 UNION ALL
    SELECT login, gw5 FROM TestData WHERE gw5 > 0
),
CteCountByLoginGw AS(
    SELECT
        login, gw, COUNT(*) AS cc
    FROM Cte
    GROUP BY login, gw
),
CteFinal AS(
    SELECT login
    FROM CteCountByLoginGw c
    GROUP BY login
    HAVING
        MAX(cc) > 1
        AND COUNT(DISTINCT gw) = (
                SELECT COUNT(*) 
                FROM CteCountByLoginGw 
                WHERE 
                    c.login = login 
                    AND cc = MAX(c.cc)
        )
)
SELECT t.*
FROM CteFinal c
INNER JOIN TestData t
    ON t.login = c.login

First you unpivot the table without including gw that are equal to 0. The result (CTE) is:

login      gw
---------- -----------
Peter      1
Sandy      1
Voland     1
Don        1
Sandy      1
Don        2
Peter      1
Sandy      1
Voland     1
Sandy      1
Voland     2
Don        1
Don        2

Then, you perform a COUNT(*) GROUP BY login, gw. The result would be (CteCountByLoginGw):

login      gw          cc
---------- ----------- -----------
Don        1           2
Peter      1           2
Sandy      1           4
Voland     1           2
Don        2           2
Voland     2           1

Finally, only get those login whose max(cc) is greater 1. This is to eliminate rows like 1,2,3,4,0. And login whose unique gw is the same the max(cc). This is to make sure that the occurrence of a gw column is the same as others:

login      gw1         gw2         gw3         gw4         gw5
---------- ----------- ----------- ----------- ----------- -----------
Peter      1           0           1           0           0
Sandy      1           1           1           1           0
Don        1           2           0           1           2

Upvotes: 3

Related Questions