Incognito
Incognito

Reputation: 133

SQL - Update String in Colum with regex

Using SQL, I am trying to update substring in a column with regexp.

That's my goal :

or an other content with multiple links:

  <p>BlaBlaBla</p>
  <a href="http://blabla.fr/blibli">Link1</a>
  <a href="http://blabla.fr/">Link2</a>
  <a href="http://blabla.fr/test">Link3 </a>

(This example represente an entire string, not many result of select ) - I want to remove 'http://' in urls which not contains 'blibli'

So I did this query

 UPDATE my_table SET my_col = replace(my_col,'http://','') where not regexp_like(my_col,'http://^((?blibli).)*$')

But this work only if the column contains at least one link like my first example. If the column contains many link, it will update nothing due to the conditions. (Maybe doing this with many step? ) I tried to have a look to REGEXP_SUBSTR but i think i can't achieve this with it too. The desired result for the second example is :

<p>BlaBlaBla</p>
<a href="http://blabla.fr/blibli">Link1</a>
<a href="blabla.fr/">Link2</a>
<a href="blabla.fr/test">Link3 </a>

Thanks for any advice you can give me

Upvotes: 3

Views: 5496

Answers (3)

Alex Poole
Alex Poole

Reputation: 191245

Similar to Trung's approach, but capturing both (either) matched group:

update my_table set my_col = regexp_replace(my_col,
  'href="(http://[^"]*blibli[^"]*)"|href="http://([^"]*)"', 'href="\1\2"')

optionally with a where clause to avoid updating rows that don't have an href, say.

Quick demo of how sample values (supplied in a CTE) are translated:

with my_table (id, my_col) as (
  select 1, '<a href="http://blabla.fr/blibli">Link1</a>' from dual
  union all select 2, '<a href="http://blabla.fr/blibli/abc">Link1</a>' from dual
  union all select 3, '<a href="http://blabla.fr/xbliblix">Link1</a>' from dual
  union all select 4, '<a href="http://blabla.fr/test">Link1</a>' from dual
  union all select 5, '<p>BlaBlaBla</p>
<a href="http://blabla.fr/blibli">Link1</a>
<a href="http://blabla.fr/">Link2</a>
<a href="http://blabla.fr/test">Link3 </a>' from dual
)
select id, my_col, regexp_replace(my_col,
    'href="(http://[^"]*blibli[^"]*)"|href="http://([^"]*)"',
    'href="\1\2"') as result
from my_table;

        ID MY_COL                                           RESULT                                          
---------- ------------------------------------------------ ------------------------------------------------
         1 <a href="http://blabla.fr/blibli">Link1</a>      <a href="http://blabla.fr/blibli">Link1</a>     
         2 <a href="http://blabla.fr/blibli/abc">Link1</a>  <a href="http://blabla.fr/blibli/abc">Link1</a> 
         3 <a href="http://blabla.fr/xbliblix">Link1</a>    <a href="http://blabla.fr/xbliblix">Link1</a>   
         4 <a href="http://blabla.fr/test">Link1</a>        <a href="blabla.fr/test">Link1</a>              
         5 <p>BlaBlaBla</p>                                 <p>BlaBlaBla</p>                                
           <a href="http://blabla.fr/blibli">Link1</a>      <a href="http://blabla.fr/blibli">Link1</a>     
           <a href="http://blabla.fr/">Link2</a>            <a href="blabla.fr/">Link2</a>                  
           <a href="http://blabla.fr/test">Link3 </a>       <a href="blabla.fr/test">Link3 </a>             

Upvotes: 2

Trung Duong
Trung Duong

Reputation: 3475

You could use REGEXP_REPLACE, with grouping

UPDATE my_table SET my_col= REGEXP_REPLACE(my_col, 'href="(http://.*blibli[^"]*)|href="http://([^"])', 'href="\1')

You could check the sample at this link http://rextester.com/HEMG60862

Upvotes: 4

Onur Cete
Onur Cete

Reputation: 273

You said "I want to remove 'http://' in urls which not contains 'blibli'"

Why do not you use simple regex like that?

 UPDATE my_table SET my_col= replace(my_col,'http://','') where not regexp_like(my_col,'*blibli*')

Upvotes: 1

Related Questions