Ogen

Reputation: 6709

Regex group is matching quotes when I don't want it to

I have this regular expression:

"([^"\\]|\\.)*"|(\S+)

The problem is that when I have an input like "foo" and use a matcher to go through the groups, the first group it finds is "foo" (quotes included) when I want it to be foo. What am I doing wrong?

EDIT:

I'm using Java, and I just fixed it:

"((?:[^"\\]|\\.)*)"|(\S+)

The first capturing group didn't include the * quantifier, so it never covered the whole string. I wrapped the quantified part in a new capturing group and made the existing inner group non-capturing.

EDIT: Actually no... it's working in the online regex debuggers but not in my program...

Upvotes: 2

Views: 61

Answers (1)

Wiktor Stribiżew

Reputation: 626748

Capture the contents of the double-quoted literal (branch 1) in a group of its own, then check whether that group matched. If it did, take its value; otherwise take the group from branch 2.

Also, consider unrolling the pattern:

 "([^"\\]*(?:\\.[^\\"]*)*)"|(\S+)

In Java:

String pat = "\"([^\"\\\\]*(?:\\\\.[^\\\\\"]*)*)\"|(\\S+)";
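
To read the result, check which group took part in each match. Below is a minimal sketch, assuming you want a list with one entry per token; the class name `QuotedTokens` and the helper `tokenize` are hypothetical names used only for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedTokens {
    // Unrolled pattern from this answer:
    // group 1 = contents of a double-quoted literal (quotes excluded),
    // group 2 = a bare run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile(
            "\"([^\"\\\\]*(?:\\\\.[^\\\\\"]*)*)\"|(\\S+)");

    // Hypothetical helper, not part of the question: one list entry per match.
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // group(1) is null when the quoted branch did not take part in the match;
            // an empty literal "" still yields a non-null empty string.
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Input text: "foo" bar "a \"b\" c"
        System.out.println(tokenize("\"foo\" bar \"a \\\"b\\\" c\""));
    }
}
```

Note that `m.group()` and `m.group(0)` always return the whole match, quotes included, so the code has to ask for group 1 explicitly. The sketch also leaves escape sequences inside the literal as they are; unescaping them would be a separate step.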

Note that patterns like (A|B)* can cause a StackOverflowError in Java on long inputs, which is why an unrolled version is preferable.

Upvotes: 1
