Vignesh
Vignesh

Reputation: 364

Oracle regex to find the special character in name field

I'm trying to filter out the names which have special characters.

Requirement:

1) Filter the names which have characters other than a-zA-Z , space and forward slash(/).

Regex being tried out:

1) regexp_like (customername,'[^a-zA-Z[:space:]\/]'))
2) regexp_like (customername,'[^a-zA-Z \/]'))

The above two regex helps in finding the names with special characters like ? and dot(.)

For example:

LEAL/JO?O

FRANCO/DIVALDO Sr.

But I couldn't figure out why some names(listed below) with the allowed characters(a-zA-Z , space and forward slash(/)) also get retrieved.

For example:

ESTEVES/MARIA INES

PEREZ/JOSE

DUTRA SILVA/LIGIA

Please help to figure out the mistake in the regex being used.

Many thanks in advance!

Upvotes: 1

Views: 32980

Answers (2)

Beege
Beege

Reputation: 665

@gary_w has most of the bases well covered....

Here's my sql version of unix: cat -vet MyFile

select replace(regexp_replace(my_column,'[^[:print:]]', '!ACK!'),' ','.') as CAT_VET
from my_table

... all the non-printing characters become !ACK! and spaces become . You still need to determine what the characters actually ARE, but it's useful to find the looney-toon characters in your data.

Also, select dump(my_column) ... is another way to view the raw column values.

Upvotes: 0

Gary_W
Gary_W

Reputation: 10360

Your regex #1 worked for me on 11g with the name data copied/pasted from this page. I wonder if you have non-printable control characters in the data? Try adding [:cntrl:] to the regex to catch control characters. P.S. the backslash is not needed before the slash when inside of a character class (square brackets).

SQL> with tbl(name) as (
      select 'LEAL/JO?O'          from dual union
      select 'FRANCO/DIVALDO Sr.' from dual union
      select 'ESTEVES/MARIA INES' from dual union
      select 'PEREZ/JOSE'         from dual union
      select 'DUTRA SILVA/LIGIA'  from dual
    )
    select *
    from tbl
    where regexp_like(name, '[^a-zA-Z[:space:][:cntrl:]/]');

NAME
------------------
FRANCO/DIVALDO Sr.
LEAL/JO?O

SQL>

If you can copy/paste this, run it and get the same results, then something is up with the data in your table. Have a look at the data in HEX which will bring to light a previously hidden character perhaps. Here's a simple example which shows the name "JOSE" in HEX. Using one of the numerous ASCII charts out there like http://www.asciitable.com/ you can see there are no hidden characters:

SQL> select 'JOSE' as chr, rawtohex('JOSE') as hex from dual;

CHR  HEX
---- --------
JOSE 4A4F5345

SQL>

So, have a look at a name or two and see if you have any hidden characters. If not, I suspect a conflicting characterset issue maybe.

Upvotes: 2

Related Questions