Suryaprakash Pisay
Suryaprakash Pisay

Reputation: 648

Unexpected behavior with Oracle ltrim

select ltrim('PREPROCESSINGLIST_AOD', 'PREPROCESSINGLIST_') as node_code from dual;

The expected result is AOD, but when I run by changing the first string then getting unexpected behavior. Below are the trails I tried and run by changing the character after _ like BOD, COD...ZOD and below is weird result.

When    AOD Result==>   AOD
When    BOD Result==>   BOD
When    COD Result==>   D
When    EOD Result==>   D
When    FOD Result==>   FOD
When    GOD Result==>   D
When    HOD Result==>   HOD
When    IOD Result==>   D
When    JOD Result==>   JOD
When    KOD Result==>   KOD
When    LOD Result==>   D
When    MOD Result==>   MOD
When    NOD Result==>   D
When    OOD Result==>   D
When    POD Result==>   D
When    QOD Result==>   QOD
When    ROD Result==>   D
When    SOD Result==>   D
When    TOD Result==>   D
When    UOD Result==>   UOD
When    VOD Result==>   VOD
When    WOD Result==>   WOD
When    XOD Result==>   XOD
When    YOD Result==>   YOD
When    ZOD Result==>   ZOD

And... My question why is this weird behavior?

Upvotes: 0

Views: 197

Answers (3)

Alex Poole
Alex Poole

Reputation: 191425

if at all I just want the substring after _, what should be best to use?

It isn't quite clear if you want to only remove the exact string 'PREPROCESSINGLIST_', and if so whether that should only be matched at the start of the string or anywhere; or you want to remove anything up to the first underscore, or anything up to any underscore.

Depending on your actual data and the result you want to get, you can use regexp_replace() as @FrankScmitt showed (with or without an anchor), or a plain replace(), or a combination of instr() and substr().

With some made-up data with various patterns provided in a CTE:

with t (str) as (
            select 'PREPROCESSINGLIST_AOD' from dual
  union all select 'PREPROCESSINGLIST_BOD' from dual
  union all select 'PREPROCESSINGLIST_COD' from dual
  union all select 'PREPROCESSINGLIST_DOD' from dual
  union all select 'XYZ_PREPROCESSINGLIST_EOD' from dual
  union all select 'XYZ_FOD' from dual
  union all select 'ABC_XYZ_GOD' from dual
  union all select 'HOD' from dual
)
select str,
  regexp_replace(str, '^PREPROCESSINGLIST_', null) as anchor_regex,
  regexp_replace(str, 'PREPROCESSINGLIST_', null) as free_regex,
  replace(str, 'PREPROCESSINGLIST_', null) as free_replace,
  case when instr(str, '_') > 0 then substr(str, instr(str, '_') + 1) else str end
    as first_underscore,
  case when instr(str, '_') > 0 then substr(str, instr(str, '_', -1) + 1) else str end
    as last_underscore
from t;

STR                       ANCHOR_REGEX              FREE_REGEX  FREE_REPLAC FIRST_UNDERSCORE      LAST_UNDERS
------------------------- ------------------------- ----------- ----------- --------------------- -----------
PREPROCESSINGLIST_AOD     AOD                       AOD         AOD         AOD                   AOD        
PREPROCESSINGLIST_BOD     BOD                       BOD         BOD         BOD                   BOD        
PREPROCESSINGLIST_COD     COD                       COD         COD         COD                   COD        
PREPROCESSINGLIST_DOD     DOD                       DOD         DOD         DOD                   DOD        
XYZ_PREPROCESSINGLIST_EOD XYZ_PREPROCESSINGLIST_EOD XYZ_EOD     XYZ_EOD     PREPROCESSINGLIST_EOD EOD        
XYZ_FOD                   XYZ_FOD                   XYZ_FOD     XYZ_FOD     FOD                   FOD        
ABC_XYZ_GOD               ABC_XYZ_GOD               ABC_XYZ_GOD ABC_XYZ_GOD XYZ_GOD               GOD        
HOD                       HOD                       HOD         HOD         HOD                   HOD        

If you can get the result you need in more than one way then it is generally more efficient to avoid regular expressions, but sometimes they are the only (sane) choice. As always it's best to test the options yourself against your actual data to see what is most efficient - or at least efficient enough.

Upvotes: 3

Lasse V. Karlsen
Lasse V. Karlsen

Reputation: 391496

It isn't weird behavior. It is documented behavior.

According to the documentation of LTRIM:

LTRIM removes from the left end of char all of the characters contained in set.

So the second parameter, though in a string type, is not a string as such, it is a set of characters to trim away.

So this:

ltrim('PREPROCESSINGLIST_COD', 'PREPROCESSINGLIST_')

will end up returning D because both C and O is in PREPROCESSINGLIST_:

PREPROCESSINGLIST_
     ^^
     here

_AOD, however, A is not in the set so the trimming stops there.

If you test you can see that you get exactly the same behavior from this:

ltrim('PREPROCESSINGLIST_COD', 'CEGILNOPRST_')

This should be the same characters as in PREPROCESSINGLIST_.

Upvotes: 4

Frank Schmitt
Frank Schmitt

Reputation: 30815

The question why your code behaves this way has already been answered by Lasse Vågsæther Karlsen. To achieve what you want, you can use a regular expression:

select regexp_replace('PREPROCESSINGLIST_COD', 
                      '^PREPROCESSINGLIST_', 
                      '') as node_code 
from dual;

This will remove a leading PREPROCESSINGLIST from the string, but leave it there if it's in the middle of the string (because of the ^ anchor for beginning-of-string).

Upvotes: 3

Related Questions