user1274399

Reputation: 115

Java Pattern split considering capturing groups

I need to split a string using a regex, but only group 1 of the regex should act as the split token. An example:

Original String = "paulo\\;Is\\;In;Real;Doubt"

Array formed using the split = ["paulo\\;Is\\;In", "Real", "Doubt"]

My first idea was to use the regex [^\\\\][;], but it did not work: the character matched before the ; is consumed as part of the delimiter.

The output was: ["paulo\\;Is\\;I", "Rea", "Doubt"] (I am using the String.split() method.)

My second idea was to put the ; in a group: [^\\\\]([;]), but I can't find a way to tell the split method to treat only group(1) as the split token.

Upvotes: 2

Views: 2713

Answers (4)

user557597

If escapes can escape anything, you would be better off not splitting at all and instead finding all the fields with a single global regex that has one capture group.

Raw regex:

(?:(?<=;)|(?<=^))([^;\\]*(?:\\.[^;\\]*)*)(?:;|$)

expanded:

(?:                              // prevent (mitigate) re-capture of last field
   (?<=;)
 | (?<=^)
)

( [^;\\]* (?:\\.[^;\\]*)* )      // Capture the field, grp 1 (can be blank)

(?:                              // The delimiter or end of string
   ;
 |
   $
)
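
For illustration, here is a minimal sketch of how the regex above might be applied in Java with Matcher.find(), collecting group 1 of each match. The class name FieldFinderDemo and the method name fields are made up for this example and are not part of the answer; the pattern is the raw regex above with its backslashes doubled for a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FieldFinderDemo {
    // The raw regex from the answer, escaped for a Java string literal.
    private static final Pattern FIELD = Pattern.compile(
        "(?:(?<=;)|(?<=^))([^;\\\\]*(?:\\\\.[^;\\\\]*)*)(?:;|$)");

    // Hypothetical helper: returns group 1 (the field) of every match.
    static List<String> fields(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(input);
        while (m.find()) {
            out.add(m.group(1)); // the field, escapes left intact
        }
        return out;
    }

    public static void main(String[] args) {
        // Runtime value of the literal: paulo\;Is\;In;Real;Doubt
        System.out.println(fields("paulo\\;Is\\;In;Real;Doubt"));
        // prints [paulo\;Is\;In, Real, Doubt]
    }
}
```

Because a backslash always consumes the character after it, an escaped backslash followed by a delimiter (the runtime text a\\;b) yields the two fields a\\ and b, which a plain lookbehind-based split would not separate.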

Upvotes: 0

wds

Reputation: 32283

Your question is hard to answer because it is entirely unclear. You say your split token is "just the group 1 of the regex". Group 1 of what regex?

EDIT: Still hard to answer, why don't you clarify?

Anyway, if what you want is "split on ';', but only when it's not escaped with a '\'", then you can use negative lookbehind to get what you want.

Example:

"paulo\\;Is\\;In;Real;Doubt".split("(?<!\\\\);");

gives

[ "paulo\;Is\;In", "Real", "Doubt" ]
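
As a self-contained sketch of the same idea, the lookbehind split can be wrapped in a small class and run directly. The class name EscapedSplitDemo and the method name splitUnescaped are invented for this example. Since the lookbehind is zero-width, only the ';' itself is consumed as the delimiter.

```java
import java.util.Arrays;

public class EscapedSplitDemo {
    // Splits on ';' only when it is not immediately preceded by a backslash.
    // (?<!\\) is a zero-width negative lookbehind, so no character is lost.
    static String[] splitUnescaped(String input) {
        return input.split("(?<!\\\\);");
    }

    public static void main(String[] args) {
        // Runtime value of the literal: paulo\;Is\;In;Real;Doubt
        String input = "paulo\\;Is\\;In;Real;Doubt";
        System.out.println(Arrays.toString(splitUnescaped(input)));
        // prints [paulo\;Is\;In, Real, Doubt]
    }
}
```

Note that this treats any ';' directly after a backslash as escaped, so it assumes the backslash itself is never escaped in the input.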

Upvotes: 3

Chetter Hummin

Reputation: 6817

Well, I had to modify your input a little bit as I got errors when

String x = "paulo\\;Is\\;In;Real;Doubt";
String[] res = x.split("\\\\;");

Upvotes: 0

assylias

Reputation: 328598

The problem is that the expression below is true:

("\;").equals(";")

So your original string is equal to:

"paulo;Is;In;Real;Doubt"

Upvotes: 0
