XDfox
XDfox

Reputation: 53

Oracle NLS Settings Special Characters Replacement Handling

I found a weird situation in Oracle

Situation Faced

For my further testing, I think this is because the NLS Settings

In German.Germany.AL32UTF8 the substr function unable to grab the correct character especially after the German Special Character. However it is work in User's Session

Besides that, I also had try to use regex_replace but it is also not work for the Oracle Job it unable to replace it.

string:= REGEXP_REPLACE(ps_string, '[äöüßÄÖÜ]', ' ');

Instead of changing on the NLS Settings is there have any other solution for this?

Upvotes: 0

Views: 400

Answers (2)

Wernfried Domscheit
Wernfried Domscheit

Reputation: 59456

I am quite sure "remove special character that length of byte >= 2" is more a dirty hack than a solution of your actual problem.

Anyway, when you set your NLS_LANG properly, i.e. that it matches the character set of your application, then it works well:

SELECT SUBSTR('AÜHLKÖ',3,1)
FROM DUAL;
   
SUBSTR('AÜHLKÖ',3,1)
--------------------
H 

As an alternative try SUBSTRC

If you really need to remove "special characters" - which I doubt - then this function should work:

REGEXP_REPLACE(ASCIISTR('ABÄCöE'), '\\[[:xdigit:]]{4}', ' ')

Upvotes: 1

ORA-01017
ORA-01017

Reputation: 1075

Your problem is different character set in different database.

In my database with NLS Settings = 'AMERICAN.AMERICA.AL32UTF8'

WITH CTE AS (SELECT 'Aöasd' STR FROM DUAL)
SELECT SUBSTR(STR,LEVEL,1) AS NEW_ST, 
       LENGTHB(SUBSTR(STR,LEVEL,1)) AS NEW_ST_LB 
  FROM CTE
CONNECT BY LEVEL <= LENGTH(STR);

N  NEW_ST_LB
- ----------
A          1
ö          2
a          1
s          1
d          1

Now, I tried the same query in my other DB with German character set and result is 1 for all characters. (WE8ISO8859P15)

N  NEW_ST_LB
- ----------
A          1
ö          1 --<--- See this
a          1
s          1
d          1

The thing is in WE8ISO8859P15 character set, It takes only 1 byte to store it rather then 2 Bytes.

Solution:

Use the CONVERT function to convert each character to UTF8 and then check the length of the character in terms of byte and replace it if it is greater than 1.

Query:

WITH CTE AS (SELECT 'Aöasd' STR FROM DUAL)
SELECT CONVERT(SUBSTR(STR,LEVEL,1), 'UTF8' ) AS NEW_ST, 
       LENGTHB( CONVERT(SUBSTR(STR,LEVEL,1), 'UTF8' )) AS NEW_ST_LB 
  FROM CTE
CONNECT BY LEVEL <= LENGTH(STR);

Result in DB with NLS Settings = 'AMERICAN.AMERICA.AL32UTF8'

NEW_ST                NEW_ST_LB
-------------------- ----------
A                             1
ö                             2
a                             1
s                             1
d                             1

Result in DB with NLS Settings = 'German.Germany.AL32UTF8'

NEW_S  NEW_ST_LB
----- ----------
A              1
ö             2
a              1
s              1
d              1

Upvotes: 1

Related Questions