waltersworks

Reputation: 31

How can I make my regex lookahead return the desired values?

I have been attempting to get a regex lookahead conditional to work, but am having some trouble. I've consulted the forums, to no avail.

I am parsing data where a field can contain one or more numbers. I want a regex conditional so that if the field contains any whitespace, only the last number is returned; otherwise, if the field contains no whitespace, the whole field is returned.

Example field:

6,300.69 22,359.06 11,712.20 40,371.95 0.00 0.00 0.00 40,371.95

Example regex: /(?(?=\s)\s+\S*$|.*)/

So again, IF "\s" is found, THEN apply "\s+\S*$", ELSE apply ".*"

Individually, each regex returns what I would expect.

\s returns the first whitespace.

\s+\S*$ returns any non-whitespace characters after the final whitespace.

.* returns the whole field

My syntax appears correct according to regex101.com and regular-expressions.info/conditional.html.

Any idea what I'm doing wrong?

My regex /(?(?=\s)\s+\S*$|.*)/ returns the following right now:

Example field 1:

6,300.69 22,359.06 11,712.20 40,371.95 0.00 0.00 0.00 40,371.95

Returns:

6,300.69 22,359.06 11,712.20 40,371.95 0.00 0.00 0.00 40,371.95

Example field 2:

40,371.95

Returns:

40,371.95

Upvotes: 2

Views: 111

Answers (2)

The fourth bird

Reputation: 163297

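Your conditional tests whether there is a whitespace character directly to the right of the current position. If there is, it matches \s+\S*$ (one or more whitespace characters followed by optional non-whitespace characters up to the end of the string); otherwise it matches the whole line with .*. The lookahead (?=\s) does not search the rest of the field for whitespace. At the start of your example the next character is a digit, so the else branch runs.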

It can be written as just \s+\S*$|.*, so whenever the first part of the alternation does not match at the current position, .* matches the whole line.
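
To see that behaviour concretely, here is a minimal Java sketch. As far as I know, java.util.regex does not support the (?(condition)yes|no) syntax, so the sketch uses the alternation form instead; the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlternationDemo {
    // Alternation form of the conditional pattern from the question.
    private static final Pattern ALT = Pattern.compile("\\s+\\S*$|.*");

    // Returns the first match found in the field, or null if there is none.
    static String firstMatch(String field) {
        Matcher m = ALT.matcher(field);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String multi = "6,300.69 22,359.06 11,712.20 40,371.95 0.00 0.00 0.00 40,371.95";
        String single = "40,371.95";

        // At index 0 the first alternative fails (no whitespace there),
        // so .* takes over and consumes the whole field in both cases.
        System.out.println(firstMatch(multi));
        System.out.println(firstMatch(single));
    }
}
```

Both calls print the full field, which matches what the question reports.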

Instead, you can match one or more non-whitespace characters up to the end of the string, without any lookarounds or word boundaries:

\S+$
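
A minimal Java sketch of this pattern applied to both example fields; the class and method names are hypothetical, and it assumes the field has no trailing whitespace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastFieldDemo {
    // One or more non-whitespace characters anchored at the end of the input.
    private static final Pattern LAST = Pattern.compile("\\S+$");

    // Returns the last whitespace-separated value, or null if nothing matches.
    static String lastField(String field) {
        Matcher m = LAST.matcher(field);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String multi = "6,300.69 22,359.06 11,712.20 40,371.95 0.00 0.00 0.00 40,371.95";
        String single = "40,371.95";

        System.out.println(lastField(multi));   // last number only
        System.out.println(lastField(single));  // the whole field
    }
}
```

If a field can end in spaces, \S+$ finds no match, so trimming the input first (for example with String.trim()) would be needed.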


Upvotes: 0

WJS

Reputation: 40034

Here is a Java solution. This works for both Strings s1 and s2. The main difference is that I am using a look-behind.

  • (?<=\\s)

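import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastNumber {
    public static void main(String[] args) {
        String s1 = "6,300.69 22,359.06 11,712.20 40,371.95 0.00 0.00 0.00 40,371.95";
        String s2 = s1.replace(" ", "");  // same data with the spaces removed

        // Either the whole string when it contains no whitespace,
        // or the final run of non-whitespace that is preceded by whitespace.
        Pattern p = Pattern.compile("((?:^[\\S]*$)|(?:(?<=\\s)\\S+$))");

        for (String s : new String[] {s1, s2}) {
            Matcher m = p.matcher(s);
            if (m.find()) {
                System.out.println(m.group(1));
            }
        }
    }
}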

prints

40,371.95
6,300.6922,359.0611,712.2040,371.950.000.000.0040,371.95

For non-Java use, the regex would be:

/((?:^[\S]*$)|(?:(?<=\s)\S+$))/


Upvotes: 0
