Michelle
Michelle

Reputation: 111

Postgresql - How do I extract the first occurence of a substring in a string using a regular expression pattern?

I am trying to extract a substring from a text column using a regular expression, but in some cases, there are multiple instances of that substring in the string.

In those cases, I am finding that the query does not return the first occurrence of the substring. Does anyone know what I am doing wrong?

For example:

If I have this data:

create table data1
(full_text text, name text);

insert into data1 (full_text)
values ('I 56, donkey, moon, I 92')

I am using

UPDATE data1
SET name = substring(full_text from '%#"I ([0-9]{1,3})#"%' for '#')

and I want to get 'I 56' not 'I 92'

Upvotes: 11

Views: 15121

Answers (3)

Guasqueño
Guasqueño

Reputation: 447

Try the following expression. It will return the first occurrence:

SUBSTRING(full_text, 'I [0-9]{1,3}')

Upvotes: 2

jian
jian

Reputation: 4824

You can use regexp_match() In PostgreSQL 10+

select regexp_match('I 56, donkey, moon, I 92', 'I [0-9]{1,3}');

Quote from documentation:

In most cases regexp_matches() should be used with the g flag, since if you only want the first match, it's easier and more efficient to use regexp_match(). However, regexp_match() only exists in PostgreSQL version 10 and up. When working in older versions, a common trick is to place a regexp_matches() call in a sub-select...

Upvotes: 0

user330315
user330315

Reputation:

You can use regexp_matches() instead:

update data1
  set full_text = (regexp_matches(full_text, 'I [0-9]{1,3}'))[1];

As no additional flag is passed, regexp_matches() only returns the first match - but it returns an array so you need to pick the first (and only) element from the result (that's the [1] part)

It is probably a good idea to limit the update to only rows that would match the regex in the first place:

update data1
  set full_text = (regexp_matches(full_text, 'I [0-9]{1,3}'))[1]
where full_text ~ 'I [0-9]{1,3}'

Upvotes: 15

Related Questions