user2441441

Reputation: 1387

Hadoop - Pipe delimiter not recognized

In my mapper, I want to split each line of a file on the pipe character. A line is a long string that looks like number|twitter|abc.. However, the pipe delimiter is not recognized when I do:

String[] columnArray = line.split("|");

If I split on a space instead, with line.split(" "), it works fine, so I don't think the problem is with recognizing characters in general. Is there some other character that can look like a pipe? Why doesn't split recognize the | character?

Upvotes: 0

Views: 626

Answers (2)

JBuenoJr

Reputation: 975

As explained in another answer (https://stackoverflow.com/a/9808719/2623158): "String.split expects a regular expression argument. An unescaped | is parsed as a regex meaning 'empty string or empty string,' which isn't what you mean." To match a literal pipe, escape it as "\\|".

Here's a test example.

public class Test
{
   public static void main(String[] args)
   {
      String str = "test|pipe|delimiter";
      String[] tmpAr = str.split("\\|");

      for(String s : tmpAr)
      {
         System.out.println(s);
      }
   }
}
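
To see the difference side by side, here is a minimal sketch that runs both calls on the sample line from the question. The class and method names (SplitComparison, splitUnescaped, splitEscaped) are made up for illustration. Because the unescaped pattern matches the empty string, it should cut between individual characters instead of at the pipes, while the escaped pattern yields the three fields.

```java
import java.util.Arrays;

// Hypothetical demo class; the names here are illustrative only.
class SplitComparison {
    // Unescaped: "|" is a regex alternation of two empty patterns,
    // so it matches the empty string at every position in the input.
    static String[] splitUnescaped(String line) {
        return line.split("|");
    }

    // Escaped: the Java literal "\\|" is the regex \| and matches a literal pipe.
    static String[] splitEscaped(String line) {
        return line.split("\\|");
    }

    public static void main(String[] args) {
        String line = "number|twitter|abc";
        // Prints many single-character pieces rather than three fields.
        System.out.println(Arrays.toString(splitUnescaped(line)));
        // Prints [number, twitter, abc]
        System.out.println(Arrays.toString(splitEscaped(line)));
    }
}
```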

Upvotes: 1

jtahlborn

Reputation: 53694

String.split takes a regular expression (as the Javadoc states), and "|" is a special character in regular expressions. Try "[|]" instead; inside a character class the pipe is treated as a literal.
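
A short sketch of the character-class form, using the sample line from the question. As an assumption beyond this answer, it also shows java.util.regex.Pattern.quote, a standard-library method that turns any string into a literal pattern and can be handy when the delimiter is not hard-coded. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
class PipeSplitAlternatives {
    // Character class: inside [...] the pipe has no special meaning.
    static String[] splitWithCharClass(String line) {
        return line.split("[|]");
    }

    // Pattern.quote wraps the delimiter so the regex engine treats it literally.
    // This is an alternative not mentioned in the answer above.
    static String[] splitWithQuote(String line) {
        return line.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        String line = "number|twitter|abc";
        // Both calls print [number, twitter, abc]
        System.out.println(Arrays.toString(splitWithCharClass(line)));
        System.out.println(Arrays.toString(splitWithQuote(line)));
    }
}
```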

Upvotes: 0
