Reputation: 145
I have the following table:
ID | Number | Size |
---|---|---|
7 | 1 | 1.5 |
7 | 2 | 1.5 |
8 | 1 | 1.625 |
8 | 2 | 1.03125 |
8 | 3 | 1.03125 |
8 | 4 | 1.03125 |
8 | 5 | 1.625 |
8 | 6 | 1 |
8 | 7 | 1.625 |
8 | 8 | 1.625 |
8 | 9 | 1.625 |
9 | 1 | 1 |
9 | 2 | 2 |
9 | 3 | 3 |
9 | 4 | 4 |
9 | 5 | 1 |
I would like to create a column with a single string value for each ID that groups the size, but only when the values are consecutive based on the number, and includes the number of matching sizes.
For example, for ID=8, I would like something like this:
1.625 x 1.03125 (x3) x 1.625 x 1 x 1.625 (x3)
Please note that the number could increment to any integer value and the size could be any positive number, up to 7 decimal places.
As I've been thinking through it, these are the steps I've considered so far:
ID | Number | Size | Group |
---|---|---|---|
7 | 1 | 1.5 | 1 |
7 | 2 | 1.5 | 1 |
8 | 1 | 1.625 | 1 |
8 | 2 | 1.03125 | 2 |
8 | 3 | 1.03125 | 2 |
8 | 4 | 1.03125 | 2 |
8 | 5 | 1.625 | 3 |
8 | 6 | 1 | 4 |
8 | 7 | 1.625 | 5 |
8 | 8 | 1.625 | 5 |
8 | 9 | 1.625 | 5 |
9 | 1 | 1 | 1 |
9 | 2 | 2 | 2 |
9 | 3 | 3 | 3 |
9 | 4 | 4 | 4 |
9 | 5 | 1 | 5 |
ID | Number | Size | Group | Quantity |
---|---|---|---|---|
7 | 1 | 1.5 | 1 | 2 |
8 | 1 | 1.625 | 1 | 1 |
8 | 2 | 1.03125 | 2 | 3 |
8 | 5 | 1.625 | 3 | 1 |
8 | 6 | 1 | 4 | 1 |
8 | 7 | 1.625 | 5 | 3 |
9 | 1 | 1 | 1 | 1 |
9 | 2 | 2 | 2 | 1 |
9 | 3 | 3 | 3 | 1 |
9 | 4 | 4 | 4 | 1 |
9 | 5 | 1 | 5 | 1 |
Finally, potentially using a combination of the STRING_AGG() and CONCAT() functions, I would put the data into the following format:
ID | Size |
---|---|
7 | 1.5 (x2) |
8 | 1.625 (x1) x 1.03125 (x3) x 1.625 (x1) x 1 (x1) x 1.625 (x3) |
9 | 1 (x1) x 2 (x1) x 3 (x1) x 4 (x1) x 1 (x1) |
And, preferably, I would omit the " (x1)", such that the final form looks like this:
ID | Size |
---|---|
7 | 1.5 (x2) |
8 | 1.625 x 1.03125 (x3) x 1.625 x 1 x 1.625 (x3) |
9 | 1 x 2 x 3 x 4 x 1 |
It's kind of a lot. But in the end, I'm hoping to turn the data from these tables into a more readable format for users, but 1) I wasn't exactly sure how to do this, and 2) Given potentially multiple routes for accomplishing this, I wasn't sure what the most performant option was.
Any thoughts?
P.S. I am open to any methods, it certainly doesn't have to follow the train of thought I included here.
*Edited to add a new dataset (ID=9)
Upvotes: 1
Views: 112
Reputation: 9191
You have a gaps and islands problem, one option is to use a difference between two row_number
s to define the required groups, try the following using SQL Server syntax:
with cte1 as -- Step 1: using the difference between two row_numbers approach, create groups for consecutive similar values of Size.
(
select *,
row_number() over (partition by id order by number) -
row_number() over (partition by id, size order by number) grp
from table_name
),
cte2 as -- Step 2: get the counts for each group defined in the previous step
(
select id, min(number) number, size, grp, count(*) cnt
from cte1
group by id, size, grp
)
-- Step 3: use string_agg and concat functions to get the desired format
select id,
string_agg
(-- use a case expression to not include (1x) when count = 1
case when cnt > 1 then concat(size, ' (x', cnt, ')') else cast(size as varchar(20)) end, ' x '
) within group (order by number) Size
from cte2
group by id
order by id
Upvotes: 1