Reputation: 111
I am trying to extract a substring from a text column using a regular expression, but in some cases, there are multiple instances of that substring in the string.
In those cases, I am finding that the query does not return the first occurrence of the substring. Does anyone know what I am doing wrong?
For example:
If I have this data:
create table data1
(full_text text, name text);
insert into data1 (full_text)
values ('I 56, donkey, moon, I 92')
I am using
UPDATE data1
SET name = substring(full_text from '%#"I ([0-9]{1,3})#"%' for '#')
and I want to get 'I 56'
not 'I 92'
Upvotes: 11
Views: 15121
Reputation: 447
Try the following expression. It will return the first occurrence:
SUBSTRING(full_text, 'I [0-9]{1,3}')
Upvotes: 2
Reputation: 4824
You can use regexp_match()
In PostgreSQL 10+
select regexp_match('I 56, donkey, moon, I 92', 'I [0-9]{1,3}');
Quote from documentation:
In most cases regexp_matches() should be used with the g flag, since if you only want the first match, it's easier and more efficient to use regexp_match(). However, regexp_match() only exists in PostgreSQL version 10 and up. When working in older versions, a common trick is to place a regexp_matches() call in a sub-select...
Upvotes: 0
Reputation:
You can use regexp_matches()
instead:
update data1
set full_text = (regexp_matches(full_text, 'I [0-9]{1,3}'))[1];
As no additional flag is passed, regexp_matches()
only returns the first match - but it returns an array so you need to pick the first (and only) element from the result (that's the [1]
part)
It is probably a good idea to limit the update to only rows that would match the regex in the first place:
update data1
set full_text = (regexp_matches(full_text, 'I [0-9]{1,3}'))[1]
where full_text ~ 'I [0-9]{1,3}'
Upvotes: 15