ianmayo

Reputation: 2402

Fix Regular Expression to allow optional fields

A data-line looks like this:

$POSL,VEL,SPL,,,4.1,0.0,4.0*12

The 7th field (4.1) is extracted into the named group SPEED using this Java regex:

\\$POSL,VEL,SPL,,,(?<SPEED>\\d+.\\d+),.*

The data format has changed slightly: fields 4, 5 and 6 may now contain data:

$POSL,VEL,SPL,a,b,c,4.0,a,b,c,d

However, the regex now returns zero. Note that fields 4, 5 and 6 may contain letters or digits, but they will never contain quoted strings, so quoted commas are not a concern.

Can someone offer a fix please?

Upvotes: 1

Views: 239

Answers (3)

Kousik Mandal

Reputation: 720

Assuming one comma was missing from the first input line:

    package arraysAndStrings;
    
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class RegexGroupCapture {
        public static void main(String[] args) {
            String inputArr[] = { "$POSL,VEL,SPL,,,,4.1,0.0,4.0*12",
                    "$POSL,VEL,SPL,a,b,c,4.0,a,b,c,d" };
            for (String input : inputArr) {
                System.out.println(extractSpeed(input));
            }
        }
    
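        private static float extractSpeed(String input) {
            float speed = 0; // returned when the line does not match
            try {
                // [^,]* keeps each optional field between its own commas;
                // \\. matches a literal decimal point rather than any character.
                String regex = "\\$POSL,VEL,SPL,[^,]*,[^,]*,[^,]*,(?<SPEED>\\d+\\.\\d+),.*";
                Pattern pattern = Pattern.compile(regex);
                Matcher matcher = pattern.matcher(input);
                if (matcher.find()) {
                    // Read the value through the named group instead of its index.
                    speed = Float.parseFloat(matcher.group("SPEED"));
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
            return speed;
        }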
    }

Output
=====
4.1
4.0

Upvotes: 0

The fourth bird

Reputation: 163362

You can match a field that is either empty or made up of letters and digits using ,[A-Za-z0-9]*

As the second string has one more comma than the first, you can make that extra field optional.

If you are not interested in the rest of the line, only in the capturing group, you can omit the .* at the end. If the value can also occur at the end of the string, you can end the pattern with an alternation (?:,|$)

Note that the dot must be escaped, \\d+\\.\\d+, because an unescaped dot matches any character.
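As a small illustration of the point about the dot, the sketch below compares the unescaped and escaped forms. The class and method names (DotEscapeDemo, loose, strict) are made up for this example.

```java
import java.util.regex.Pattern;

public class DotEscapeDemo {
    // Unescaped dot: any single character is accepted between the digit runs.
    static final Pattern LOOSE = Pattern.compile("\\d+.\\d+");
    // Escaped dot: a literal '.' is required between the digit runs.
    static final Pattern STRICT = Pattern.compile("\\d+\\.\\d+");

    static boolean loose(String s) {
        return LOOSE.matcher(s).matches();
    }

    static boolean strict(String s) {
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        // "4x1" and "123" pass the loose pattern but not the strict one.
        for (String s : new String[] { "4.1", "4x1", "123" }) {
            System.out.println(s + " loose=" + loose(s) + " strict=" + strict(s));
        }
    }
}
```

With the unescaped dot, a field such as 4x1 or a plain integer such as 123 would be accepted as a speed.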

\$POSL,VEL,SPL,[A-Za-z0-9]*,[A-Za-z0-9]*,(?:[A-Za-z0-9]*,)?(?<SPEED>\d+\.\d+)(?:,|$)

In Java with double escaped backslashes

String regex = "\\$POSL,VEL,SPL,[A-Za-z0-9]*,[A-Za-z0-9]*,(?:[A-Za-z0-9]*,)?(?<SPEED>\\d+\\.\\d+)(?:,|$)";
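A minimal sketch of how this pattern could be used on both sample lines from the question. The class name SpeedExtractor and the choice to return an Optional are assumptions for illustration, not part of the original code.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpeedExtractor {
    // Compiled once; fields 4 and 5 may be empty, the 6th field is optional.
    private static final Pattern SPEED_LINE = Pattern.compile(
            "\\$POSL,VEL,SPL,[A-Za-z0-9]*,[A-Za-z0-9]*,(?:[A-Za-z0-9]*,)?"
            + "(?<SPEED>\\d+\\.\\d+)(?:,|$)");

    // Returns the text of the SPEED group, or empty if the line does not match.
    static Optional<String> extractSpeed(String line) {
        Matcher m = SPEED_LINE.matcher(line);
        // find() rather than matches(), because the pattern does not consume the whole line.
        return m.find() ? Optional.of(m.group("SPEED")) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractSpeed("$POSL,VEL,SPL,,,4.1,0.0,4.0*12"));
        System.out.println(extractSpeed("$POSL,VEL,SPL,a,b,c,4.0,a,b,c,d"));
    }
}
```

Returning an empty Optional for a non-matching line avoids confusing "no match" with a genuine speed of zero.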

Upvotes: 1

azro

Reputation: 54148

You can use \w* to match any run of letters or digits, including an empty one, for fields 4, 5 and 6:

\\$POSL,VEL,SPL,\\w*,\\w*,\\w*,(?<SPEED>\\d+\\.\\d+),.*
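One detail worth knowing when choosing \w: in Java it is shorthand for [a-zA-Z_0-9] by default, so it also accepts an underscore. The sketch below compares it with an explicit letter/digit class; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class WordClassDemo {
    // \w* accepts letters, digits and the underscore.
    static final Pattern WORD = Pattern.compile("\\w*");
    // Explicit class: letters and digits only.
    static final Pattern ALNUM = Pattern.compile("[A-Za-z0-9]*");

    static boolean wordClass(String field) {
        return WORD.matcher(field).matches();
    }

    static boolean explicitClass(String field) {
        return ALNUM.matcher(field).matches();
    }

    public static void main(String[] args) {
        // Both accept "" and "abc123"; only \w* accepts "a_b".
        for (String field : new String[] { "", "abc123", "a_b" }) {
            System.out.println("'" + field + "' \\w*=" + wordClass(field)
                    + " [A-Za-z0-9]*=" + explicitClass(field));
        }
    }
}
```

If the fields can never contain an underscore, both forms behave the same on this data.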

Note that in your post, both the example line and the regex appear to be missing a comma if the number is meant to be the seventh field.


Upvotes: 1
