Jason

Reputation: 4130

Regex To Match A Pipe-Delineated File

I need help with a regex that checks whether a line is a valid line of pipe-delineated data. Each line ends with a pipe, the fields are not quoted, and some fields will be empty.

Here's what I'm trying to use:

Pattern dataPattern = Pattern.compile("(.+)\\|^");

Here is a sample line of data:

GJ 3486|||121.10766667|-83.23302778|295.84892861999998|-24.832649669999999||-0.48399999999999999||.371|2MASS J08042586-8313589|8.9700000000000006|8.3539999999999992|8.1110000000000007||2MASS||

Since I only wanted to see if the line matched the pattern, I thought the one I came up with would look for "blah blah blah |". Apparently not... can anyone help me out?

Jason

Upvotes: 3

Views: 16631

Answers (5)

Scott Rippey

Reputation: 15810

It looks like you're using ^ at the end of the pattern. ^ anchors the start of the input, so you should be using $ (end of input) instead.

"(.+)\\|$"

Upvotes: 0

Fred

Reputation: 5006

Pattern dataPattern = Pattern.compile("^([^\\|]*\\|)+$");

This regex should work. But if you just want to check whether your line ends with a pipe, this regex is simpler:

Pattern dataPattern = Pattern.compile("^.*\\|$");
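As a rough sketch of how the first (stricter) pattern could be used to validate a line: the class name PipeLineValidator and the method isPipeTerminated are made up for illustration, and the sample line is a shortened version of the one in the question. It calls Matcher.matches(), which already requires the whole input to match, so the ^ and $ anchors are redundant there but harmless.

```java
import java.util.regex.Pattern;

class PipeLineValidator {
    // Pattern from the answer: one or more "field|" groups, where a field
    // is any run of non-pipe characters (possibly empty).
    static final Pattern FIELDS = Pattern.compile("^([^\\|]*\\|)+$");

    // Hypothetical helper: true if the whole line consists of pipe-terminated fields.
    static boolean isPipeTerminated(String line) {
        return FIELDS.matcher(line).matches();
    }

    public static void main(String[] args) {
        // Shortened sample (assumption: same shape as the data in the question).
        String good = "GJ 3486|||121.10766667|-83.23302778||2MASS||";
        String bad = "GJ 3486|||121.10766667"; // no trailing pipe

        System.out.println(isPipeTerminated(good)); // true
        System.out.println(isPipeTerminated(bad));  // false
    }
}
```

Because of the + quantifier, an empty string is rejected, while a line consisting of a single "|" (one empty field) is accepted.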

Upvotes: 0

thejh

Reputation: 45568

How about this?

str.length() > 1 && str.charAt(str.length()-1) == '|'

This is probably much faster than a regex, since it only looks at the last character.
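Wrapped in a method, the check might look like the sketch below. The class name PipeSuffixCheck, the method name endsWithPipe, and the null guard are assumptions added for illustration; the length and charAt test is the one from the expression above.

```java
class PipeSuffixCheck {
    // Hypothetical helper: true if the string has at least one character
    // before a trailing pipe. The null guard is an added assumption.
    static boolean endsWithPipe(String str) {
        return str != null
                && str.length() > 1
                && str.charAt(str.length() - 1) == '|';
    }

    public static void main(String[] args) {
        System.out.println(endsWithPipe("GJ 3486|||2MASS||")); // true
        System.out.println(endsWithPipe("GJ 3486|||2MASS"));   // false
        System.out.println(endsWithPipe("|"));                 // false: nothing before the pipe
    }
}
```

If a lone "|" should also count as valid, String.endsWith("|") is a standard-library alternative that drops the length requirement.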

Upvotes: 0

Vlad

Reputation: 9481

Your regex is wrong; it should be:

Pattern dataPattern = Pattern.compile("(.+)\\|$");
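To see why the trailing ^ fails where $ works, here is a minimal sketch. The class name AnchorDemo, the constant names, and the shortened sample line are illustrative assumptions. It uses Matcher.find(), which searches for the pattern anywhere in the input.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // Pattern from the question: after "(.+)\|" has consumed text, '^'
    // (start of input) cannot hold, so this never matches a single line.
    static final Pattern START_ANCHORED = Pattern.compile("(.+)\\|^");

    // Corrected pattern: '$' asserts the end of the input.
    static final Pattern END_ANCHORED = Pattern.compile("(.+)\\|$");

    public static void main(String[] args) {
        // Shortened sample (assumption: same shape as the data in the question).
        String line = "GJ 3486|||121.10766667|-83.23302778||2MASS||";

        System.out.println(START_ANCHORED.matcher(line).find()); // false
        System.out.println(END_ANCHORED.matcher(line).find());   // true
    }
}
```

Note that the greedy .+ also swallows the earlier pipes, so this only confirms that the line ends with a pipe; it does not check the individual fields.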

Upvotes: 1

FailedDev

Reputation: 26930

^(.*?\|)*$

Try this instead.

"
^        # Assert position at the beginning of the string
(        # Match the regular expression below and capture its match into backreference number 1
   .        # Match any single character that is not a line break character
      *?       # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
   \\|       # Match the character “|” literally
)*       # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
$        # Assert position at the end of the string (or before the line break at the end of the string, if any)
"

Some problems with your regex :

  • First, it is not repeating. You should repeat the pattern, since you have many columns.
  • You match something and then you assert the start of the string (^). That position can never follow matched text, so the pattern will never match.
  • You always require at least one character to match, but you said there could be empty columns. Use the * quantifier instead.
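The point about empty columns can be sketched as follows. LAZY is the pattern from this answer; the other two patterns are hypothetical per-field variants (not from the thread) that use [^|] so a field cannot span a pipe, which makes the difference between + and * visible. The class name EmptyColumnDemo and the shortened sample line are also assumptions.

```java
import java.util.regex.Pattern;

class EmptyColumnDemo {
    // Pattern from the answer above.
    static final Pattern LAZY = Pattern.compile("^(.*?\\|)*$");

    // Hypothetical variants for comparison: each field is a run of
    // non-pipe characters, required (+) or optional (*).
    static final Pattern NON_EMPTY_FIELDS = Pattern.compile("^([^|]+\\|)*$");
    static final Pattern ANY_FIELDS = Pattern.compile("^([^|]*\\|)*$");

    static boolean matches(Pattern p, String line) {
        return p.matcher(line).matches();
    }

    public static void main(String[] args) {
        // Shortened sample with two empty columns after the first field.
        String line = "GJ 3486|||121.10766667|";

        System.out.println(matches(NON_EMPTY_FIELDS, line)); // false: empty columns rejected
        System.out.println(matches(ANY_FIELDS, line));       // true
        System.out.println(matches(LAZY, line));             // true
    }
}
```

Since the outer quantifier is *, both LAZY and ANY_FIELDS also accept an empty string; changing the outer * to + would require at least one column.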

Upvotes: 7
