Oz.

Reputation: 97

In Java how to match a character that is not a "["?

I have a block of text that has information encoded as follows:

[tag 1] some text [tag 2] more text [tag 3] even more text 

I am creating a regular expression in Java that will extract the encoded information into separate strings, such as:

[tag 1] some text
[tag 2] more text
[tag 3] even more text

The regular expression that I have created is (for regular pattern matching): "(\[.+?\][^[]+)"

This regular expression works well in Notepad++ and two online tools:

  1. http://www.regextester.com/
  2. http://www.softlion.com/webTools/RegExpTest/default.aspx

In Java this regular expression statement produces a runtime exception:

Pattern pattern = Pattern.compile("(\\[.+?\\][^[]+)");

Exception details:

Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 13
(\[.+?\][^[]+)
             ^

Do I have to escape the negated "[" within the character class? If so, how?

Upvotes: 2

Views: 151

Answers (4)

Kendall Frey

Reputation: 44376

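This is a quirk of Java's regex syntax rather than a bug: Java supports nested character classes (unions such as [a-d[m-p]]), so an unescaped [ inside a character class opens a new class.

Most other regex flavors do not require you to escape it, but in Java you must.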

(\[.+?\][^\[]+)
"(\\[.+?\\][^\\[]+)"

It could be considered good practice to escape special characters, even if they don't need to be. It also helps avoid bugs like this.

Upvotes: 0

kgautron

Reputation: 8273

You need to escape the bracket. This should work:

[^\\[]

Upvotes: 0

Andrew Clark

Reputation: 208545

Escape the [ within the negated character class. Most regex flavors do not require this inside a character class, but Java clearly has an issue with the unescaped form. Escaping a character that has no special meaning inside a character class does not change what the class matches.

Try the following:

(\[.+?\][^\[]+)

Or for the Java code:

Pattern pattern = Pattern.compile("(\\[.+?\\][^\\[]+)");
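
For completeness, here is a minimal sketch of how the escaped pattern could be used to pull out the segments from the question's sample text. The class name TagSplitter and the helper split are hypothetical names chosen for illustration, and trimming the trailing whitespace is an assumption about the desired output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
class TagSplitter {
    // The "[" inside the negated class is escaped so that Java does not
    // treat it as the start of a nested character class.
    private static final Pattern SEGMENT = Pattern.compile("(\\[.+?\\][^\\[]+)");

    // Returns each "[tag] text" segment found in the input, in order.
    static List<String> split(String input) {
        List<String> segments = new ArrayList<>();
        Matcher matcher = SEGMENT.matcher(input);
        while (matcher.find()) {
            // trim() drops the whitespace left before the next "[" (an assumption
            // about the wanted output, not something the regex itself does).
            segments.add(matcher.group(1).trim());
        }
        return segments;
    }

    public static void main(String[] args) {
        String text = "[tag 1] some text [tag 2] more text [tag 3] even more text ";
        for (String segment : split(text)) {
            System.out.println(segment);
        }
    }
}
```

Each call to find() resumes where the previous match ended, so the loop walks the text one tagged segment at a time.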

Upvotes: 3

Godwin

Reputation: 9937

You need to escape the square bracket, just like you escaped the others earlier:

(\\[.+?\\][^\\[]+)

The runtime exception is being caused because the RegEx parser sees [^[] as having an unclosed bracket.
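
A likely reason the parser behaves this way is that Java's Pattern syntax allows character classes to be nested to form unions (for example [a-d[m-p]]), so an unescaped [ inside a class opens a new one. The sketch below checks whether each variant compiles; BracketEscapeDemo and compiles are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, not part of the original question.
class BracketEscapeDemo {
    // Returns true if Java accepts the regex, false if compilation fails.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Unescaped "[" inside the negated class: the form from the question.
        System.out.println("unescaped compiles: " + compiles("(\\[.+?\\][^[]+)"));
        // Escaped "[" inside the negated class: the suggested fix.
        System.out.println("escaped compiles:   " + compiles("(\\[.+?\\][^\\[]+)"));
        // Nested classes form a union: a through d, or m through p.
        System.out.println("union matches 'n':  " + Pattern.matches("[a-d[m-p]]", "n"));
    }
}
```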

Upvotes: 1
