artemis
artemis

Reputation: 7281

Oracle SQL - Regular Expression matching using REGEXP_REPLACE()

Good morning,

I am hoping to find assistance with writing a select query to remove some text from a column.

I have created a column called "TEXT_MINING" in a previous query, that some code a different developer wrote will perform some text mining analysis on. The TEXT_MINING column has text that looks like:

EMPLOYEE FOUND BROKEN HANDLE ON HAMMER * 02-08-18 15:19:22 PM * I found a hammer that had the wood split on the handle, tossed into scrap.

I want to remove the * and all of the text in between the two * to help my software engineer do some text mining. Here is my current dilemma:

Not only do I not know how to use REGEXP_REPLACE, but I can't get the REGEXP worked out. I currently have:

^[*]\w[*]$

So it looks like:

REGEXP_REPLACE(col, '^[*]\w[*]$', '')

Could anyone advise?

Thank you!

Upvotes: 1

Views: 560

Answers (2)

Aleksej
Aleksej

Reputation: 22969

This could be a way:

select regexp_replace(yourString, '\*.*\*', '') from yourTable

Please notice that this will remove everything between the first and the last '*' in the string; for example:

with test(x) as (
select 'Something * something else * and a * just before another * and something more' from dual
)
select regexp_replace(x, '\*.*\*', '') from test

gives:

Something  and something more

Upvotes: 2

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 627469

You may use this approach to remove 1+ occurrences of *...* substrings in your column:

SELECT REGEXP_REPLACE(
   'EMPLOYEE FOUND BROKEN HANDLE ON HAMMER * 02-08-18 15:19:22 PM * I found a hammer that had the wood split on the handle, tossed into scrap.', 
   '\s*\*[^*]*\*', 
   ''
) as Result from dual

See the online demo

Pattern details

  • \s* - 0+ whitespaces
  • \* - a * char
  • [^*]* - 0+ chars other than *
  • \* - a * char.

See the regex demo.

enter image description here

Upvotes: 3

Related Questions