HatsOn

Reputation: 67

Split string by array of characters

I want to split a string on any character from an array of characters, so I have this code:

String target = "hello,any|body here?";
char[] delim = {'|',',',' '};
String regex = "(" + new String(delim).replaceAll("(.)", "\\\\$1|").replaceAll("\\|$", ")");
String[] result = target.split(regex);

Everything works fine, except that when I add a character like 'Q' to the delim array, it throws an exception:

java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence near index 11
(\ |\,|\||\Q)

So how can I fix this so that it works with non-special characters as well?

Thanks in advance.

Upvotes: 1

Views: 357

Answers (3)

Bernhard Barker

Reputation: 55589

Using Pattern.quote and putting it in square brackets seems to work:

String regex = "[" + Pattern.quote(new String(delim)) + "]";

Tested with possible problem characters.
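
For reference, a minimal sketch of this approach applied to the string from the question, with 'Q' added to the delimiters. The class name QuoteClassSplit and the helper splitOnAny are made up for illustration; only Pattern.quote and String.split come from the standard library. It assumes delim is not empty, since an empty character class is not a valid pattern.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteClassSplit {
    // Hypothetical helper: quotes the delimiters as one literal run
    // and wraps that run in a character class.
    static String[] splitOnAny(String target, char[] delim) {
        String regex = "[" + Pattern.quote(new String(delim)) + "]";
        return target.split(regex);
    }

    public static void main(String[] args) {
        char[] delim = {'|', ',', ' ', 'Q'};
        String[] result = splitOnAny("hello,any|body here?QandQmore", delim);
        System.out.println(Arrays.toString(result));
    }
}
```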

Upvotes: 1

Sergey Kalinichenko

Reputation: 726479

how can i fix that to work with non-special characters as well

Put square brackets around your characters instead of escaping them. If ^ is included in your list of characters, make sure it is not the first character, or escape it separately if it is the only character on the list.

Dashes also need special treatment: they need to go at the beginning or at the end of the character class.

String delimStr = new String(delim);
String regex;
if (delimStr.equals("^")) {
    // A lone ^ cannot go in brackets unescaped, so escape it instead.
    regex = "\\^";
} else if (delimStr.charAt(0) == '^') {
    // Repeat the second character in front so that ^ no longer leads the class.
    // This assumes that all characters are distinct.
    // You may need a stricter check to make this work in general case.
    regex = "[" + delimStr.charAt(1) + delimStr + "]";
} else {
    regex = "[" + delimStr + "]";
}
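
If the position rules for ^ and - get awkward, a different option (not what the snippet above does) is to backslash-escape inside the brackets every character that is not a letter or digit. Java allows a backslash before any non-alphabetic character, so this also covers ], \ and &. A rough sketch, where the class ClassEscapeSplit and the helper toCharClass are hypothetical names and the delimiters are assumed to be ordinary BMP characters:

```java
import java.util.Arrays;

public class ClassEscapeSplit {
    // Hypothetical helper: builds "[...]" and puts a backslash before
    // every delimiter that is not a letter or digit. Letters and digits
    // are left alone because a backslash would change their meaning.
    static String toCharClass(char[] delim) {
        StringBuilder sb = new StringBuilder("[");
        for (char c : delim) {
            if (!Character.isLetterOrDigit(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        char[] delim = {'^', '-', ']', '\\', 'Q', '|'};
        String regex = toCharClass(delim);
        String[] parts = "a^b-c]d\\eQf|g".split(regex);
        System.out.println(regex + " -> " + Arrays.toString(parts));
    }
}
```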

Upvotes: 2

SJuan76

Reputation: 24780

Q is not a control character in a regex, so you do not have to put the \\ before it (the backslash only marks that the following character must be interpreted as a literal, and not as a control character).

Example

`\\.` in a regex means "a dot"

`.` in a regex means "any character"
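
A quick way to see the difference with String.split, as a small sketch (the class DotEscapeDemo and its two methods are only illustrative names):

```java
import java.util.Arrays;

public class DotEscapeDemo {
    // Splits on a literal dot.
    static String[] splitOnLiteralDot(String s) {
        return s.split("\\.");
    }

    // Splits on "any character": every character is treated as a
    // delimiter, so only empty pieces remain and split drops them.
    static String[] splitOnAnyChar(String s) {
        return s.split(".");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnLiteralDot("a.b.c")));
        System.out.println(splitOnAnyChar("a.b.c").length);
    }
}
```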

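\\Q fails because Q is not a special character in a regex, so it does not need to be quoted. In Java, a backslash in front of a letter never produces a plain literal: it is either a construct with its own meaning (\Q begins a quoted section) or an error.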

I would make delim a String array and add the escapes only to the values that need them.

String[] delim = {"\\|", ..... "Q"};
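
As a variation on the String-array idea, one could let Pattern.quote decide instead of escaping by hand: quote every delimiter and join them with |. This is only a sketch; the class AlternationSplit and the helper toAlternation are hypothetical names, and the delimiter values are taken from the question plus "Q".

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class AlternationSplit {
    // Hypothetical helper: quotes each delimiter so it is matched
    // literally, then joins the quoted pieces into an alternation.
    static String toAlternation(String[] delim) {
        return Arrays.stream(delim)
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
    }

    public static void main(String[] args) {
        String[] delim = {"|", ",", " ", "Q"};
        String regex = toAlternation(delim);
        String[] result = "hello,any|body here?QandQmore".split(regex);
        System.out.println(Arrays.toString(result));
    }
}
```

Because each entry is a String, this form also accepts delimiters longer than one character.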

Upvotes: 0
