Reputation: 41
I have a working script:
Select col from table where regexp_like (col,'^noun[ |s |es ]| noun[ |s |es ]|noun[ |s |es ]$','i');
Can I optimize my three blocks in REGEXP to a shorter form?
Good:
noun abcd
nouns abcd
abcd noun abcd
abcd nounes abcd
abcd noun
Wrong:
nounse abcd
abcd anouns abcd
abcd inoun
Upvotes: 1
Views: 218
Reputation: 29647
In most regex engines a word boundary \b
can be used to get the seperate words.
But in Oracle regex you need to do that differently.
(^|\s)noun(e?s)?(\s|$)
(^|\s) : start of string or whitespace
(e?s)? : optional group that has 'es' or 's'
(\s|$) : whitespace or end of string
Setting up test data:
create table test_table (id number(8,0), col varchar2(30), matchexpected char(1));
insert into test_table (id, col, matchexpected) values (1,'noun abcd','Y');
insert into test_table (id, col, matchexpected) values (2,'nouns abcd','Y');
insert into test_table (id, col, matchexpected) values (3,'abcd NOUN abcd','Y');
insert into test_table (id, col, matchexpected) values (4,'abcd nounEs abcd','Y');
insert into test_table (id, col, matchexpected) values (5,'abcd noun','Y');
insert into test_table (id, col, matchexpected) values (6,'nounse abcd','N');
insert into test_table (id, col, matchexpected) values (7,'abcd anouns abcd','N');
insert into test_table (id, col, matchexpected) values (8,'abcd inoun','N');
Example Query:
select *
from test_table
where regexp_like (col,'(^|\s)noun(e?s)?(\s|$)','i');
Or use \W
(non-word character: [^A-Za-z0-9_]
) in the regex. Instead of \s
(whitespace). To match also strings like "abc nounes!".
select *
from test_table
where regexp_like (col,'(^|\W)noun(e?s)?(\W|$)','i');
Result:
The first 5 id's.
Upvotes: 3