J-P-Robin

Reputation: 310

Exact string ColdFusion regular expression

I am using a regular expression to remove everything that is not the exact word "NULL", while also keeping all digits. As a first step, I removed all "NULL" words from my string with this:

<cfset data = ReReplaceNoCase("123NjyfjUghfLL|NULL|NULL|NULL","\bNULL\b","","ALL")>

It removes every instance of the exact word "NULL", which means it does not remove the letters "N", "U" and "L" from the substring "123NjyfjUghfLL". This is correct. Now I want the reverse: keep only the word "NULL", so that the stray "N", "U" and "L" letters are removed. So I tried this:

<cfset data = ReReplaceNoCase("123NjyfjUghfLL|NULL|NULL|NULL","[^\bNULL\b]","","ALL")>

But this keeps every "N", "U" and "L" letter, so it outputs "NULLNULLNULLNULL". "NULL" should appear only 3 times.

Can someone help me with this, please? And where do I add the extra code to keep digits? Thank you.

Upvotes: 3

Views: 525

Answers (1)

Regular Jo

Reputation: 5510

You can do this:

<cfset data = ReReplaceNoCase("123NjyfjUghfLL|NULL|NULL|NULL","(^|\|)(?!NULL(?:$|\|))([^|]*)(?=$|\|)","\1","ALL")>

(^|\|)(?!NULL(?:$|\|))([^|]*)(?=$|\|)

Explanation:

 (                   # Opens Capture Group 1
     ^               # Anchors to the beginning of the string.
 |                   # Alternation (CG1)
     \|              # Literal |
 )                   # Closes CG1
 (?!                 # Opens Negative Lookahead
     NULL            # Literal NULL
     (?:             # Opens Non-Capturing group
         $           # Anchors to the end of the string.
     |               # Alternation (NCG)
         \|          # Literal |
     )               # Closes NCG
 )                   # Closes NLA
 (                   # Opens Capture Group 2
     [^|]*           # Negated Character class (excludes the characters within)
                       # None of: |
                       # * repeats zero or more times
 )                   # Closes CG2
 (?=                 # Opens Lookahead (LA)
     $               # Anchors to the end of the string.
 |                   # Alternation (LA)
     \|              # Literal |
 )                   # Closes LA

Regex101.com demo
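
As a quick check outside ColdFusion, here is a minimal sketch that runs the same pattern through Python's re module. This is an assumption of convenience: the answer itself uses ReReplaceNoCase, and the sketch assumes both engines treat these constructs the same way. re.IGNORECASE stands in for the NoCase variant, and re.sub replaces every match, like the "ALL" scope. The function name blank_non_null_fields is hypothetical.

```python
import re

# Same pattern as in the answer; only the host language differs.
PATTERN = re.compile(r"(^|\|)(?!NULL(?:$|\|))([^|]*)(?=$|\|)", re.IGNORECASE)

def blank_non_null_fields(data: str) -> str:
    """Empty every pipe-delimited field that is not exactly NULL."""
    # \1 puts back the leading delimiter (or nothing at the start of the string).
    return PATTERN.sub(r"\1", data)

result = blank_non_null_fields("123NjyfjUghfLL|NULL|NULL|NULL")
print(result)  # |NULL|NULL|NULL
```

Note that the pipe delimiters survive: the first field is emptied, but the pipe that follows it stays, so the result starts with "|".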

Lastly, some insight into character classes (the content between square brackets).

What [^\bNULL\b] means is:

 [^\bNULL\b]     # Negated Character class (excludes the characters within)
                   # None of: \b,N,U,L
                   # When \b is inside a character class, it matches a backspace character.
                   # Outside of a character class, \b matches a word boundary as you use it in your first code.

Character classes are not designed for matching or ignoring words, they're designed for permitting or excluding characters or ranges of characters.
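
A short sketch of that behaviour, again using Python's re module as a stand-in for ColdFusion (an assumption; the question's own call is ReReplaceNoCase). It shows that the bracketed version is just a set of single characters, which is why the question's second attempt printed "NULL" four times.

```python
import re

data = "123NjyfjUghfLL|NULL|NULL|NULL"

# Inside brackets, \b is a backspace character and N, U, L are single letters,
# so this removes every character that is not one of those four.
kept = re.sub(r"[^\bNULL\b]", "", data, flags=re.IGNORECASE)
print(kept)  # NULLNULLNULLNULL

# A class is a set: order and duplicates do not matter, so this is equivalent.
same = re.sub(r"[^LUN\b]", "", data, flags=re.IGNORECASE)
```

The stray N, U, L, L letters from "123NjyfjUghfLL" survive and happen to spell a fourth "NULL".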

Edit:

"OK, so it works well. But what if I would also like to keep the digits? I am kind of lost in this line of code and I cannot find where to put the extra code... I think the extra code would be [^0-9], right?"

This regex (demo) also keeps numbers of any length, provided the number is the entire value between the pipes:

(^|\|)(?!(?:NULL|[0-9]+)(?:$|\|))([^|]*)(?=$|\|)

You can also use this regex (demo) to keep numbers with a decimal part:

(^|\|)(?!(?:NULL|[0-9]+(?:\.[0-9]+)?)(?:$|\|))([^|]*)(?=$|\|)
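
To compare the two numeric variants, here is a small sketch in Python's re module (used only as a stand-in; it assumes ColdFusion's engine handles these lookaheads the same way). The sample string and the names WHOLE_NUMBERS, DECIMALS and sample are hypothetical.

```python
import re

# The two patterns from the edit above, unchanged.
WHOLE_NUMBERS = re.compile(
    r"(^|\|)(?!(?:NULL|[0-9]+)(?:$|\|))([^|]*)(?=$|\|)", re.IGNORECASE)
DECIMALS = re.compile(
    r"(^|\|)(?!(?:NULL|[0-9]+(?:\.[0-9]+)?)(?:$|\|))([^|]*)(?=$|\|)", re.IGNORECASE)

sample = "123|abc|NULL|45x|1.5"  # made-up input for illustration

whole = WHOLE_NUMBERS.sub(r"\1", sample)
decimal = DECIMALS.sub(r"\1", sample)
print(whole)    # 123||NULL||
print(decimal)  # 123||NULL||1.5

# Digits mixed into a longer value are not kept: the whole field is emptied.
mixed = WHOLE_NUMBERS.sub(r"\1", "123NjyfjUghfLL|NULL|NULL|NULL")
print(mixed)    # |NULL|NULL|NULL
```

Both variants test the whole field, so "45x" is emptied in either case, and "1.5" is kept only by the decimal version.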

Upvotes: 1
