Stephan K.

Reputation: 15702

RegEx with Pipes and IPs not working

The RegEx:

^([0-9\.]+)\Q|\E([^\Q|\E])\Q|\E

does not match the string:

1203730263.912|12.66.18.0|

Why?

Upvotes: 0

Views: 219

Answers (2)

hwnd

Reputation: 70722

The accepted answer seems somewhat incorrect so I wanted to address this for future readers.

If you did not already know, using \Q and \E ensures that any character between \Q ... \E will be matched literally, not interpreted as a metacharacter by the regular expression engine.

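First and most important, \Q and \E should not be used within a bracketed character class []. Support there varies by engine and version, and | is already literal inside a class, so nothing needs quoting.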

[^\Q|\E]  # Incorrect
[^|]      # Correct

Secondly, you did not follow that class with a quantifier, so it matches only a single character. With both fixes, the pattern becomes:

^([0-9.]+)\Q|\E([^|]+)\Q|\E

It is much simpler, though, to escape the pipes with a backslash:

^([0-9.]+)\|([^|]+)\|
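To see the effect of the missing quantifier, here is a minimal sketch in Python. Python is an assumption on my part (the question does not name a language), and Python's re module does not support \Q...\E, so the sketch uses the backslash-escaped form of both patterns.

```python
import re

line = "1203730263.912|12.66.18.0|"

# Shape of the original pattern: the second group has no quantifier,
# so it accepts exactly one character between the two pipes.
original = re.compile(r"^([0-9.]+)\|([^|])\|")

# Fixed pattern: + lets the second group span the whole field.
fixed = re.compile(r"^([0-9.]+)\|([^|]+)\|")

original_match = original.match(line)
fixed_match = fixed.match(line)

print(original_match)        # None: "12.66.18.0" is longer than one character
print(fixed_match.groups())  # ('1203730263.912', '12.66.18.0')
```

The only difference between the two patterns is the + after the character class.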

Upvotes: 1

Avinash Raj

Reputation: 174696

From the PHP docs:

\Q and \E can be used to ignore regexp metacharacters in the pattern.

For example:

\w+\Q.$.\E$ will match one or more word characters, followed by literals .$. and anchored at the end of the string.
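For readers working in an engine without \Q...\E, the same idea can be sketched with an escaping helper. The example below assumes Python, whose re module does not support \Q...\E; re.escape() is used as the nearest analogue, and the sample strings are made up for illustration.

```python
import re

# re.escape() backslash-escapes the metacharacters in ".$.",
# which is the job \Q...\E does in PCRE.
literal = re.escape(".$.")
pattern = re.compile(r"\w+" + literal + "$")

match_ok = pattern.search("price.$.")   # hypothetical sample input
match_bad = pattern.search("priceX$Y")  # no literal dots, so no match

print(match_ok.group(0))  # price.$.
print(match_bad)          # None
```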

Your regex should be either

^([0-9\.]+)\Q|\E([^\Q|\E]*)\Q|\E

or

^([0-9\.]+)\Q|\E([^\Q|\E]+)\Q|\E

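You forgot to add a quantifier after [^\Q|\E]. Without one, the class matches exactly one character. Use * if the field between the pipes may be empty, or + if it must contain at least one character.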
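The choice between the two quantifiers only matters when the second field is empty. A small sketch, assuming Python (which has no \Q...\E, so the pipes are backslash-escaped) and a made-up input with an empty field:

```python
import re

star = re.compile(r"^([0-9.]+)\|([^|]*)\|")  # second field may be empty
plus = re.compile(r"^([0-9.]+)\|([^|]+)\|")  # second field needs 1+ characters

empty_field = "1203730263.912||"  # hypothetical line with nothing between the pipes

star_match = star.match(empty_field)
plus_match = plus.match(empty_field)

print(star_match.groups())  # ('1203730263.912', '')
print(plus_match)           # None
```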


Explanation:

  • ^ Anchors the match at the start of the string.
  • ([0-9\.]+) Captures one or more digits or dots.
  • \Q|\E In PCRE, \Q begins a quoted sequence and \E ends it; every character in between is treated literally. So the | inside the block tells the regex engine to match a literal |.
  • ([^\Q|\E]+) Captures one or more characters other than |.
  • \Q|\E Matches a literal pipe symbol.

Upvotes: 2
