Jamie J
Jamie J

Reputation: 1305

Use postgres regexp_replace() to replace integers in a string

I have a redshift database that I need to query, and need to group similar strings together. I'm using regexp_replace() to do this, but can't work out how to get it grouping strings that have integers in the middle. For example:

dataset:

string
'aaa/123/bbb'
'aaa/456/bbb'
'ccc/123/ddd'

I need to group this so that we'd get

string     count(*)
aaa/id/bbb 2
ccc/id/ddd 1

So I have tried using

regexp_replace(endpoint, '/[0-9]+$/', '/id/')

But it doesn't work, I'm assuming because there is no wildcard or something? But I can't work out how to fix this.

Thanks in advance

Upvotes: 0

Views: 889

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269603

I know that you also want to replace the numbers at the end. This comes close to what you want:

select regexp_replace(endpoint, '/[0-9]+(/|$)', '/id/')
from (select 'aaa/123/bbb' as endpoint union all
      select 'aaa/123' as endpoint 
      ) x

But it returns a slash at the end in the second case.

If you have no other intermediate values that start with numbers, then this does what you want:

select regexp_replace(endpoint, '/[0-9]+', '/id')
from (select 'aaa/123/bbb' as endpoint union all
      select 'aaa/123' as endpoint 
      ) x

Otherwise, two calls to regexp_replace() does the trick:

select regexp_replace(regexp_replace(endpoint, '/[0-9]+/', '/id/'), '/[0-9]$', '/id')
from (select 'aaa/123/bbb' as endpoint union all
      select 'aaa/123' as endpoint 
      ) x;

Upvotes: 1

Related Questions