Reputation: 2535
System.out.println("du hast mich".replaceAll("(?<=^(.*)) ", ", $1 "));
// prints "du, du hast, du hast mich"
What does the ^ symbol mean after the lookbehind? (I know its standard meaning is the start of the line.) And why does the dot match up to "du", then "du hast", then "du hast mich"? In brief, why didn't the dot match the whole string?
Please explain how this regex works. Thanks for your interest.
Upvotes: 4
Views: 160
Reputation: 75222
That regex shouldn't work at all. What it should do is throw an exception because of the open-ended quantifier (.*) in the lookbehind. You seem to have discovered a glitch that lets you bypass that rule. But don't use it! It's definitely a bug, not a feature.
Java's lookbehinds have always been a little twitchy, which I attribute to the complicated requirement that lookbehind subexpressions have a known maximum length. I've come to feel that feature was a mistake; it's just not useful enough to justify the hassles it brought with it. This is why I try to avoid using any quantifiers in my lookbehinds.
Upvotes: 0
Reputation: 279910
Kendall has the explanation. Here's the step by step.
du hast mich
^ (at "du") regex hasn't matched anything, so no replacement
writes
du
Next
du hast mich
  ^ (at the first space) regex matches
replaces the match with a comma and everything before the space
, du
Next
du hast mich
   ^ (at "hast") no match
writes
hast
Next
du hast mich
       ^ (at the second space) regex matches
replaces that match with a comma and everything before the space
, du hast
Next
du hast mich
        ^ (at "mich") no match
leaves it as is
mich
combine all that and you get
du, du hast, du hast mich
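To make the walkthrough concrete, here is a minimal sketch that reproduces the same result without relying on the lookbehind at all. It only matches the space and builds the "everything before the space" part by hand. The class name PrefixReplaceSketch and the method expand are made up for illustration; they are not from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: mimics the effect of the question's replaceAll call
// without using an unbounded lookbehind.
class PrefixReplaceSketch {
    static String expand(String input) {
        Matcher m = Pattern.compile(" ").matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0; // end of the previous match
        while (m.find()) {
            // No match here: copy the text between matches unchanged.
            out.append(input, last, m.start());
            // Match: emit a comma, then everything before this space
            // (what group 1 captured in the original regex), then a space.
            out.append(", ").append(input, 0, m.start()).append(' ');
            last = m.end();
        }
        // Copy whatever follows the last space.
        out.append(input.substring(last));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("du hast mich"));
        // prints "du, du hast, du hast mich"
    }
}
```

Each pass through the loop corresponds to one "regex matches" step above, and the append calls outside the match correspond to the "no match" steps.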
Upvotes: 2
Reputation: 44316
(?<= ) is the syntax for lookbehind. The ^ is just the "start of string" anchor. Essentially what the regex is saying is:
"Match a space which is preceded by the start of the string and any number of characters. The characters preceding the space are the first captured group."
Upvotes: 3