Ankit Mongia
Ankit Mongia

Reputation: 210

regex to convert alphanumeric and special characters in a string to * in oracle

I have a requirement to convert all the characters in my string to *. My string can also contain special characters as well.

For Example:
abc_d$ should be converted to ******.

Can any body help me with regex like this in oracle.

Thanks

Upvotes: 0

Views: 1040

Answers (5)

user5683823
user5683823

Reputation:

THIS IS NOT AN ANSWER

As suggested by Tom Biegeleisen’s brother Tim, I ran a test to compare a solution based on regular expressions to one using just standard string functions. (Specifically, Tim's answer with regular expressions vs. Patrick Artner's solution using just LENGTH and RPAD.)

Details of the test are shown below.

CONCLUSION: On a table with 5 million rows, each consisting of one string of length 30 (in a single column), the regular expression query runs in 21 seconds. The query using LENGTH and RPAD runs in one second. Both solutions read all the data from the table; the only difference is the function used in the SELECT clause. As noted already, both queries have the same execution plan, AND the same estimated cost - because the cost does not take into account differences in function calculation time.

Setup:

create table tbl ( str varchar2(30) );
insert into tbl
  select a.str
  from   ( select dbms_random.string('p', 30) as str
           from   dual
           connect by level <= 100
         ) a
         cross join
         ( select level
           from   dual
           connect by level <= 50000
         ) b
;
commit;

Note that there are only 100 distinct values, and each is repeated 50,000 times for a total of 5 million values. We know the values are repeated; Oracle doesn't know that. It will really do "the same thing" 5 million times, it won't just do it 100 times and then simply copy the results; it's not that smart. This is something that would be known only by seeing the actual stored data, it's not known to Oracle beforehand, so it can't "prepare" for such shortcuts.

Queries:

The two queries - note that I didn't want to send 5 million rows to screen, nor did I want to populate another table with the "masked" values (and muddy the waters with the time it takes to INSERT the results into another table); rather, I compute all the new strings and take the MAX. Again, in this test all "new" strings are equal to each other - they are all strings of 30 asterisks - but there is no way for Oracle to know that. It really has to compute all 5 million new strings and take the max over them all.

select max(new_str)
from   ( select regexp_replace(str, '.', '*' ) as new_str
         from   tbl
       )
;

select max(new_str)
from   ( select rpad('*', length(str), '*') as new_str
         from   tbl
       )
;

Upvotes: 1

balaiah
balaiah

Reputation: 24

select name,lpad(regexp_replace(name,name,'*'),length(name),'*')
from customer;

Upvotes: 0

Patrick Artner
Patrick Artner

Reputation: 51653

Instead of regex you could also use

select rpad('*', length('abc_d$ s'),'*') from dual  
-- use '*' and pad it until length fits with other *

Doku: rpad(string,length,appendWhat)

Repeat with a string of '*' should work as well: repeat(string,count) (not tested)

regex or rpad makes no difference - they are optimized down to the same execution plan:

n-th try of rpad:

 Plan Hash Value  : 1388734953 

-----------------------------------------------------------------
| Id | Operation        | Name | Rows | Bytes | Cost | Time     |
-----------------------------------------------------------------
|  0 | SELECT STATEMENT |      |    1 |       |    2 | 00:00:01 |
|  1 |   FAST DUAL      |      |    1 |       |    2 | 00:00:01 |
-----------------------------------------------------------------

n-th try of regex_replace

 Plan Hash Value  : 1388734953 

-----------------------------------------------------------------
| Id | Operation        | Name | Rows | Bytes | Cost | Time     |
-----------------------------------------------------------------
|  0 | SELECT STATEMENT |      |    1 |       |    2 | 00:00:01 |
|  1 |   FAST DUAL      |      |    1 |       |    2 | 00:00:01 |
-----------------------------------------------------------------

So it does not matter wich u use.

Upvotes: 1

codeLover
codeLover

Reputation: 2592

Try this:

SELECT
  REGEXP_REPLACE('B^%2',
                 '*([A-Z]|[a-z]|[0-9]|[ ]|([^A-Z]|[^a-z]|[^0-9]|[^ ]))', '*') "REGEXP_REPLACE"
  FROM DUAL;

I have included for white spaces too

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521249

Use REGEXP_REPLACE and replace any single character (.) with *.

SELECT
    REGEXP_REPLACE (col, '.', '*')
FROM yourTable

Demo

Upvotes: 3

Related Questions