Dan W

Reputation: 5782

Split String By Character

I have a case in which I'm doing the following:

final String[] columns = row.split(delimiter.toString());

Where delimiter is a Character.

This works fine when I split on tabs by passing \t as the delimiter. However, when I want to split on a pipe and pass | as the delimiter, it does not work as expected.

I've read several posts explaining that | is a special character in regular expressions (the alternation operator), so a bare | matches the empty string and split breaks the row at every character. I don't want this behavior.

I could do a simple check in my code for this pipe case and get around the issue:

if ("|".equals(delimiter.toString())) {
    columns = row.split("\\" + delimiter.toString());
} else {
    columns = row.split(delimiter.toString());
}

Is there an easier way to get around this? Also, are there any other special characters that behave like | and that I need to take into account?

Upvotes: 6

Views: 724

Answers (2)

wchargin

Reputation: 16057

Try:

import java.util.regex.Pattern;

...

final String[] columns = row.split(Pattern.quote(delimiter.toString()));

As for the other special characters (metacharacters, as they're called), here's a quote from the String Literals tutorial:

This API also supports a number of special characters that affect the way a pattern is matched.

...

The metacharacters supported by this API are: <([{\^-=$!|]})?*+.>
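To illustrate why Pattern.quote removes the need for a per-character check, here is a minimal sketch. The class name QuotedSplitDemo and the helper splitOnLiteral are made up for this example and are not part of any library; only Pattern.quote and String.split come from the JDK.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original post.
class QuotedSplitDemo {

    // Splits row on the delimiter treated as literal text.
    // Pattern.quote wraps the delimiter so that no character in it
    // is interpreted as a regex metacharacter.
    static String[] splitOnLiteral(String row, Character delimiter) {
        return row.split(Pattern.quote(delimiter.toString()));
    }

    public static void main(String[] args) {
        // The same call handles metacharacters and ordinary characters alike.
        System.out.println(Arrays.toString(splitOnLiteral("a|b|c", '|')));   // [a, b, c]
        System.out.println(Arrays.toString(splitOnLiteral("a.b.c", '.')));   // [a, b, c]
        System.out.println(Arrays.toString(splitOnLiteral("a\tb\tc", '\t'))); // [a, b, c]
    }
}
```

Because the delimiter is always quoted, there is no need to keep a list of metacharacters in your own code.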


Upvotes: 18

Adam Siemion

Reputation: 16039

  1. You can use StringUtils from Apache Commons Lang, which provides methods that accept plain text rather than regular expressions:

    public static String[] split(String str, char separatorChar)
    public static String[] split(String str, String separatorChars)
    
  2. You can also use the StringTokenizer class, which does not expect a regular expression as the delimiter.
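As a rough sketch of the second option, the following uses only java.util.StringTokenizer from the JDK. The class name TokenizerSplitDemo and the helper splitWithTokenizer are invented for illustration. One caveat worth knowing: StringTokenizer skips empty tokens, so consecutive delimiters do not produce empty columns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not from the original post.
class TokenizerSplitDemo {

    // Collects the tokens of row separated by the given delimiter character.
    // The delimiter is plain text here, so '|' needs no escaping.
    static List<String> splitWithTokenizer(String row, char delimiter) {
        StringTokenizer tokenizer = new StringTokenizer(row, String.valueOf(delimiter));
        List<String> columns = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            columns.add(tokenizer.nextToken());
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(splitWithTokenizer("a|b|c", '|')); // [a, b, c]
        // Empty fields are dropped: the two adjacent pipes yield no empty token.
        System.out.println(splitWithTokenizer("a||c", '|'));  // [a, c]
    }
}
```

If empty columns matter (for example, in delimited data files with blank fields), String.split with a quoted delimiter keeps the empty string between adjacent delimiters, whereas this approach does not.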

Upvotes: 4
