Redshift get Word patterns from a String

Question

I have a string in a column 'ABC1,ABC2,WWW1,WWW2,ABC3,WWW3,WWW4,ABC4' I need to extract words starting with ABC alone . The expected results should be

ABC1,ABC2,ABC3,ABC4

I tried the bellow but didnt work

select regexp_substr(split_part('ABC1,ABC2,WWW1,WWW2,ABC3,WWW3,WWW4,ABC4',',',1),'ABC[^"]*')

I am getting a the First occurrence only : as ABC1 . How to get all occurrence of ABC*

Yosi Dahari · Accepted Answer

You are getting only one match since the functions you used (split_part,regexp_substr) returns the n'th element of the match.

The most naive approach to solve this:

select regexp_substr(split_part('ABC1,ABC2,WWW1,WWW2,ABC3,WWW3,WWW4,ABC4',',',1),'ABC[^"]*')
union all
select regexp_substr(split_part('ABC1,ABC2,WWW1,WWW2,ABC3,WWW3,WWW4,ABC4',',',2),'ABC[^"]*')
union all
...
select regexp_substr(split_part('ABC1,ABC2,WWW1,WWW2,ABC3,WWW3,WWW4,ABC4',',',20),'ABC[^"]*')

To improve this, you would need to first transform the comma delimited value to multiple rows, and then using the regex. This can be done in various ways, see this post for more details.

Result: ABC1 ABC2

The tricky part here, is the absence os a table-generating function to split the comma delimited value to multiple rows.

Redshift get Word patterns from a String

Answers (1)

Related Questions