Reputation: 241
I am trying to write a SQL query that finds the unique set of "Numbers" that show up in a specific column. For example, in a select * query, the column I want can look like this:
Num_Option
9000
9001
9000,9001,9002
8080
8080,8000,8553
I then have another field of "date_available" which is a date/time.
Basically, what I want is to group by "date_available" while combining all of the Num_Options on that date, so something like this:
Num_Option date_available
9000,9001,9002,8080 10/22/2020
9000,9002,8080,8000,8553 10/23/2020
I am struggling to figure this out. I have gotten to the point of considering a Python script and matplotlib instead, but I am hoping there is a SQL way of handling this as well.
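For reference, here is a minimal sketch of what the table might look like (the table name, column types, and which rows fall on which date are assumptions for illustration only):
create table mytable (
    num_option     text,   -- comma-separated list, e.g. '9000,9001,9002'
    date_available date
);

insert into mytable (num_option, date_available) values
    ('9000',           date '2020-10-22'),
    ('9001',           date '2020-10-22'),
    ('9000,9001,9002', date '2020-10-22'),
    ('8080',           date '2020-10-22'),
    ('8080,8000,8553', date '2020-10-23');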
Upvotes: 0
Views: 1698
Reputation: 4208
First you have to split the strings into multiple rows with something like split_part('9000,9001,9002', ',', 1) etc. (use UNION ALL
to append the 2nd number etc.), then group them back by availability date with string_agg.
If you don't want to hardcode the split_part
positions, there is an answer here on how to dynamically split strings in Redshift; look for it.
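A rough sketch of that approach, hardcoded for at most three numbers per row (Postgres syntax and assumed table/column names; on Redshift the aggregate would be listagg() instead of string_agg()):
with split as (
    select date_available, split_part(num_option, ',', 1) as num from mytable
    union all
    select date_available, split_part(num_option, ',', 2) from mytable
    union all
    select date_available, split_part(num_option, ',', 3) from mytable
)
select date_available,
       string_agg(distinct num, ',') as num_option
from split
where num <> ''   -- split_part returns '' when there is no value at that position
group by date_available;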
Upvotes: 0
Reputation: 1269583
You may just be able to use string_agg()
:
select date_available, string_agg(num_option, ',')
from t
group by date_available;
Upvotes: 0
Reputation: 222432
In Postgres, you can use regexp_split_to_table()
in a lateral join to turn the csv elements to rows, then string_agg()
to aggregate by date:
select string_agg(distinct x.num, ',') num_option, t.date_available
from mytable t
cross join lateral regexp_split_to_table(t.num_option, ',') x(num)
group by date_available
Of course, this assumes that you want to avoid duplicate nums on the same date (otherwise, there is no need to split, you can directly aggregate).
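If a deterministic order of the numbers inside each group is wanted, a variant of the same idea (assumed table name mytable) can use string_to_array()/unnest(), which is equivalent to regexp_split_to_table() for a plain comma delimiter, and sort inside the aggregate:
select string_agg(distinct x.num, ',' order by x.num) num_option, t.date_available
from mytable t
cross join lateral unnest(string_to_array(t.num_option, ',')) x(num)
group by t.date_available
Note that the ordering is text-based, which works for the same-length numbers shown in the sample.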
Upvotes: 1