SQL grouping by distinct values in a multi-value string column

Question

I want to perform a group-by based on the distinct values in a string column that holds multiple values.

The column holds a comma-separated list of strings in a standard format. The only possible values are a, b, c, and d.

For example, the column collection (type: String) contains:

Row 1: ["a","b"]
Row 2: ["b","c"]
Row 3: ["b","c","a"]
Row 4: ["d"]

The expected output is a count of unique values:

collection | count
a | 2
b | 3
c | 2
d | 1

Dawid Kisielewski · Accepted Answer

For all of the queries below I used this table:

create table tmp (
 id INT auto_increment,
 test VARCHAR(255),
 PRIMARY KEY (id)
);

insert into tmp (test) values 
    ("a,b"),
    ("b,c"),
    ("b,c,a"),
    ("d")
;

If the only possible values are a, b, c, and d, you can try one of the following queries. Note that this only works if no value is a substring of another, such as test and test_new: the pattern for test would also match every row containing test_new, and the counts would be wrong.

select collection, COUNT(*) as count from tmp JOIN (
    select CONCAT("%", tb.collection, "%") as like_collection, collection from (
        select "a" COLLATE utf8_general_ci as collection
        union select "b" COLLATE utf8_general_ci as collection
        union select "c" COLLATE utf8_general_ci as collection
        union select "d" COLLATE utf8_general_ci as collection
    ) tb
) tb1 
ON tmp.test LIKE tb1.like_collection
GROUP BY tb1.collection;

This gives the result you want:

collection | count
    a      |   2
    b      |   3
    c      |   2
    d      |   1
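
The caveat about similar values can likely be sidestepped in MySQL with FIND_IN_SET, which matches whole comma-separated elements rather than substrings. This is a minimal sketch, not part of the original answer. It assumes the values are stored as a plain comma-separated list without spaces or quotes (as in the tmp table above) and that no value contains a comma itself. If your setup raises an "illegal mix of collations" error, a COLLATE clause like the one in the query above may be needed.

```sql
-- Sketch: join on whole list elements instead of substrings.
-- Assumes tmp.test holds values like 'a,b' (no spaces, no quotes).
SELECT tb.collection, COUNT(*) AS count
FROM tmp
JOIN (
    SELECT 'a' AS collection
    UNION SELECT 'b'
    UNION SELECT 'c'
    UNION SELECT 'd'
) tb
  -- FIND_IN_SET returns the 1-based position of the element, or 0 if absent
  ON FIND_IN_SET(tb.collection, tmp.test) > 0
GROUP BY tb.collection;
```

With this form, a value such as test would not match a row that only contains test_new, because the comparison is made per element.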

Or you can try this one:

SELECT 
   (SELECT COUNT(*) FROM tmp WHERE test LIKE '%a%') as a_count,
   (SELECT COUNT(*) FROM tmp WHERE test LIKE '%b%') as b_count,
   (SELECT COUNT(*) FROM tmp WHERE test LIKE '%c%') as c_count,
   (SELECT COUNT(*) FROM tmp WHERE test LIKE '%d%') as d_count
;

The result looks like this:

a_count | b_count | c_count | d_count
   2    |    3    |    2    |    1
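
The four subqueries each read tmp separately. A possible single-scan variant relies on MySQL evaluating a comparison to 1 or 0, so that SUM over the comparison counts the matching rows. This is a sketch under that MySQL-specific assumption, and the same substring caveat applies because it still uses LIKE.

```sql
-- Sketch: one pass over tmp using conditional aggregation.
-- In MySQL, (test LIKE '%a%') evaluates to 1 or 0, so SUM counts the matches.
SELECT
    SUM(test LIKE '%a%') AS a_count,
    SUM(test LIKE '%b%') AS b_count,
    SUM(test LIKE '%c%') AS c_count,
    SUM(test LIKE '%d%') AS d_count
FROM tmp;
```

Note that SUM returns NULL rather than 0 when tmp is empty; wrapping each sum in COALESCE(..., 0) would cover that case.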
