Reputation: 4453

How to count the number of occurences for a specific value from a range of columns using T-SQL?

I have a table (t1) as shown below that contains single-character values in all columns named v%:

Id    v1  v2   v3  v4  v5  v6
100    O   O    E   O  E   E
103    E   O    E   E  O   E
...

I need to count the number of occurrences for "O" and "E" and display them in 2 additional columns. How can I do this?

Expected Output:

Id    v1  v2   v3  v4  v5  v6  O   E
100    O   O    E   O  E   E   3   3
103    E   O    E   E  O   E   2   4
...

Upvotes: 0

Answers (5)

Gottfried Lesigang

Reputation: 67311

If you need a fully generic approach (at least you state that you need all columns LIKE 'v%') you might use a XML hack.

This approach does not need to know the actual columns:

DECLARE @tbl TABLE(Id INT,v1 CHAR(1),v2 CHAR(1),v3 CHAR(1),v4 CHAR(1),v5 CHAR(1),v6 CHAR(1))
INSERT INTO @tbl 
VALUES(100,'O','O','E','O','E','E'),
      (103,'E','O','E','E','O','E');

--the query (updated: removed the correlated sub-query as it was not needed)

SELECT t1.*
      ,A.x.value('count(/row/*[substring(local-name(),1,1)="v" and text()[1]="O"])','int') CountOfO
      ,A.x.value('count(/row/*[substring(local-name(),1,1)="v" and text()[1]="E"])','int') CountOfE
FROM @tbl t1
CROSS APPLY(SELECT t1.* FOR XML RAW,ELEMENTS XSINIL,TYPE) A(x);

The result

Id  v1  v2  v3  v4  v5  v6  CountOfO    CountOfE
100 O   O   E   O   E   E       3           3
103 E   O   E   E   O   E       2           4

The idea in short:

The APPLY will return a typed XML. Using RAW, ELEMENTS XSNINIL, TYPE) ensures a predictable format, you can use literally any table as input (well, size matters :-) ).

The counting works with XQuery's function count() in conjunction with a XQuery predicate. This will filter for

any element below <row>,
when its name starts with "v"
and its content equals "O" (or "E")

Hint: You can use sql:variable() to introduce the O or the E (or even the "v") as a variable.

And another Hint: XML hacks allow for very generic approaches normally not possible with TSQL, but this won't be fast...

And the most important hint: This questions points to a bad design. Your "v" columns should probably live within a related side table.

Upvotes: 0

Ilyes

Reputation: 14928

You can do the following only if all values are single character

select *, len(v) - len(replace(v, 'o', '')) OCount,
          len(v) - len(replace(v, 'e', '')) ECount   
from
(
  values
  (100,    'O',   'O',   'E',   'O',  'E',   'E'),
  (103,    'E',   'O',   'E',   'E',  'O',   'E')
) t(id, v1, v2, v3, v4, v5, v6)
cross apply
(
  values
  (concat(v1,v2,v3,v4,v5,v6))
) tt(v)

Upvotes: 1

Gordon Linoff

Reputation: 1270401

I would use apply and aggregation:

select t.*, v.*
from t cross apply
     (select sum(case when v.v = 'O' then 1 else 0 end) as num_os,
             sum(case when v.v = 'E' then 1 else 0 end) as num_es
      from (values (t.v1), (t.v2), (t.v3), (t.v4), (t.v5), (t.v6)) v(v)
     ) v;

Upvotes: 2

Olivier Jacot-Descombes

Reputation: 112537

Your data is not normalized because you have numbered columns instead of having a row for each vn. Therefore you cannot use aggregate functions but need to treat every column individually.

SELECT
    Id,
    v1, v2, v3, v4, v5, v6,
    CASE v1 WHEN 'O' THEN 1 ELSE 0 END +
    CASE v2 WHEN 'O' THEN 1 ELSE 0 END +
    CASE v3 WHEN 'O' THEN 1 ELSE 0 END +
    CASE v4 WHEN 'O' THEN 1 ELSE 0 END +
    CASE v5 WHEN 'O' THEN 1 ELSE 0 END +
    CASE v6 WHEN 'O' THEN 1 ELSE 0 END AS O,
    CASE v1 WHEN 'E' THEN 1 ELSE 0 END +
    CASE v2 WHEN 'E' THEN 1 ELSE 0 END +
    CASE v3 WHEN 'E' THEN 1 ELSE 0 END +
    CASE v4 WHEN 'E' THEN 1 ELSE 0 END +
    CASE v5 WHEN 'E' THEN 1 ELSE 0 END +
    CASE v6 WHEN 'E' THEN 1 ELSE 0 END AS E
FROM t1;

If you had those v in rows, you could do something like

SELECT
    Id,
    SUM(CASE v WHEN 'O' THEN 1 ELSE 0 END) AS O,
    SUM(CASE v WHEN 'E' THEN 1 ELSE 0 END) AS E
FROM t1
GROUP BY Id;

Upvotes: 0

Thom A

Reputation: 95830

The "easiest" way is to unpivot them, and then use a condition aggregate.

WITH YourTable AS(
    SELECT *
    FROM (VALUES(100,'O','O','E','O','E','E'),
                (103,'E','O','E','E','O','E'))V(Id,v1,v2,v3,v4,v5,v6))
SELECT YT.Id,
       YT.v1,
       YT.v2,
       YT.v3,
       YT.v4,
       YT.v5,
       YT.v6,
       (SELECT COUNT(CASE u.v WHEN 'O' THEN 1 END)
        FROM (VALUES(YT.v1),(YT.v2),(YT.v3),(YT.v4),(YT.v5),(YT.v6))u(v)) AS O,
       (SELECT COUNT(CASE u.v WHEN 'E' THEN 1 END)
        FROM (VALUES(YT.v1),(YT.v2),(YT.v3),(YT.v4),(YT.v5),(YT.v6))u(v)) AS E
FROM YourTable YT

Upvotes: 2

How to count the number of occurences for a specific value from a range of columns using T-SQL?

Answers (5)

Related Questions