Alex
Alex

Reputation: 12943

PostgreSQL regular expression capture group in select

How can the matched regular expression be returned from an SQL select? I tried using REGEXP_EXTRACT with no luck (function not available). What I've done that does work is this:

SELECT column ~ '^stuff.*$'
FROM table;

but this gives me a list of true / false. I want to know what is extracted in each case.

Upvotes: 37

Views: 47325

Answers (3)

Alex
Alex

Reputation: 12943

I'm using Amazon Redshift which uses PostgreSQL 8.0.2 (I should have mentioned this in the question). For me what worked was REGEXP_SUBSTR

e.g.

SELECT REGEXP_SUBSTR(column,'^stuff.*$')
FROM table

Upvotes: 5

Hambone
Hambone

Reputation: 16407

If you're trying to capture the regex match that resulted from the expression, then substring would do the trick:

select substring ('I have a dog', 'd[aeiou]g')

Would return any match, in this case "dog."

I think the missing link of what you were trying above was that you need to put the expression you want to capture in parentheses. regexp_matches would work in this case (had you included parentheses around the expression you wanted to capture), but would return an array of text with each match. If it's one match, substring is sort of convenient.

So, circling back to your example, if you're trying to return stuff if and only if it's at the beginning of a column:

select substring (column, '^(stuff)')

or

select (regexp_matches (column, '^(stuff)'))[1]

Upvotes: 53

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49270

Use regexp_matches.

SELECT regexp_matches(column,'^stuff.*$')
FROM table

The regexp_matches function returns a text array of all of the captured substrings resulting from matching a POSIX regular expression pattern. It has the syntax regexp_matches(string, pattern [, flags ]). The function can return no rows, one row, or multiple rows (see the g flag below). If the pattern does not match, the function returns no rows. If the pattern contains no parenthesized subexpressions, then each row returned is a single-element text array containing the substring matching the whole pattern. If the pattern contains parenthesized subexpressions, the function returns a text array whose n'th element is the substring matching the n'th parenthesized subexpression of the pattern (not counting "non-capturing" parentheses; see below for details). The flags parameter is an optional text string containing zero or more single-letter flags that change the function's behavior. Flag g causes the function to find each match in the string, not only the first one, and return a row for each such match.

Upvotes: 10

Related Questions