Matt Vukomanovic
Matt Vukomanovic

Reputation: 1440

SQLite Concatenating Column values

There are similar answers for other database types but I haven't found an example for this for SQLite, so I've included the answer that I came up with.

The Problem

Given a table like the following

╔══════════╦══════╗
║ Item     ║ Tag  ║
╠══════════╬══════╣
║ "Item1"  ║ "A"  ║
║ "Item1"  ║ "B"  ║
║ "Item1"  ║ "C"  ║
║ "Item2"  ║ "A"  ║
║ "Item1"  ║ "D"  ║
║ "Item2"  ║ "F"  ║
║ "Item1"  ║ "E"  ║
╚══════════╩══════╝

Create an output of:

╔═══════════╦═════════════╗
║  Item     ║    Tags     ║
╠═══════════╬═════════════╣
║  "Item1"  ║ "A,B,C,D,E" ║
║  "Item2"  ║ "A,F"       ║
╚═══════════╩═════════════╝

Upvotes: 0

Views: 1494

Answers (2)

CL.
CL.

Reputation: 180080

That's what group_concat() is for:

SELECT Item,
       group_concat(Tag) AS Tags
FROM (SELECT Item, Tag
      FROM T
      ORDER BY Item, Tag)
GROUP BY Item;

Upvotes: 4

Matt Vukomanovic
Matt Vukomanovic

Reputation: 1440

The best solution (like has been found for other databases) is to use a recursive CTE to join the table back onto itself.

The Solution

If you just want the solution then here is the final SQL

with recursive
complete as
(select Item, Tag, (select count(*) from T b where a.Tag > b.Tag and b.Item = a.Item) as cnt
from T a
),
summary(item, tags, cnt) as
(
    -- Select the initial seed values
    select item, tag, cnt
      from complete
     where cnt = 0
    union all
    -- Concatenate the next rows values onto the previous rows (this is the recursive part)
    select a.item, a.tags || ',' || b.tag, b.cnt
      from summary a
      join complete b on a.item = b.item and a.cnt + 1 = b.cnt
--  limit 200
)
final( ItemID, tags, cnt) as
(
    select ItemID, tags, max(cnt) from summary -- limit the selected values to the final rows concatenated values
    group by ItemID
)
select ItemID, tags from final
;

The explanation

Read on if you want how it works

First create a test table

--drop table T;
create table T (Item, Tag);

insert into T values('Item1', 'A');
insert into T values('Item1', 'B');
insert into T values('Item1', 'C');
insert into T values('Item2', 'A');
insert into T values('Item1', 'D');
insert into T values('Item2', 'F');
insert into T values('Item1', 'E');

select * from T;

That gives you the flat table

Item    Tag
"Item1" "A"
"Item1" "B"
"Item1" "C"
"Item2" "A"
"Item1" "D"
"Item2" "F"
"Item1" "E"

Then assign incremental numbers to the items tags

select Item,
    Tag,
    (select count(*) from T b where a.Tag > b.Tag and b.Item = a.Item) as cnt
from T a order by 1,3;  -- note we don't need order later on

Which will give a result set of

Item    Tag cnt
"Item1" "A" "0"
"Item1" "B" "1"
"Item1" "C" "2"
"Item1" "D" "3"
"Item1" "E" "4"
"Item2" "A" "0"
"Item2" "F" "1"

Then the recursive CTE

We start with

with recursive

to say we are doing a CTE

Then assign a table with all the data required - here called complete

complete as
(select Item, Tag, (select count(*) from T b where a.Tag > b.Tag and b.Item = a.Item) as cnt
from T a
),

Then define the table and columns must have a place holder for the row number we created before - here the table is called summary and the rownumber is cnt. tags was used to make it more obvious what is happening

summary(item, tags, cnt) as
(

Then select the initial seeding values. In this case that is the first row for each of the items.

    -- Select the initial seed values
    select item, tag, cnt
      from complete
     where cnt = 0

Then we union all back to the current table, Note you can use union however it is recommended to use all since it doesn't check for duplicates and is therefore quicker

    union all
    -- Concatenate the next rows values onto the previous rows (this is the recursive part)
    select a.item, a.tags || ',' || b.tag, b.cnt
      from summary a
      join complete b on a.item = b.item and a.cnt + 1 = b.cnt
--  limit 200
),

Note that in summary's definition it actually joins back to summary. That is why it's a recursive CTE - and why caution must be used when creating them

Then cut out the extra rows which we iterated through to get to the final rows concatenated value by only selecting the max row number for each item

final( Item, tags, cnt) as
(
    select Item, tags, max(cnt) from summary -- limit the selected values to the final rows concatenated values
    group by Item
)

Then select the values from that final cut table

select Item, tags from final
;

That gives the required summarised data

Item    Tags
"Item1" "A,B,C,D,E"
"Item2" "A,F"

Note I originally didn't have the final table in there but just had the code

select item, tags, max(cnt)  from summary
group by item
having max(cnt) -- limit the selected values to the final rows concatenated values

That works for the specific example I gave but not the slightly different stuff I was actually working with, which is why I've added the final table as well.

Note

- The keyword recursive is the standard but not required

- It's generally a good idea to use the LIMIT functionality when developing a recursive CTE so that if you stuff it up you don't crash - it is currently commented out in the code above.

See SQLite WITH explanation for more information about how to do recursive CTE's with SQLite

Upvotes: 1

Related Questions