AlwaysNeedingHelp

Reputation: 1945

Using regex to check if start of a string matches a pattern

I'm reading lines in from a .txt file, and I need to check if each line is 'valid'.

A valid line starts with a number between -2 and 2 inclusive, and is then followed by a single whitespace, and then potentially text.

I want to use regex for this, but am having trouble getting it working. I am quite unfamiliar with regex. Here's my code:

public static List<Sentence> readFile(String filename) {
        List<Sentence> sentences = new LinkedList<>();

        Pattern pattern = Pattern.compile("-[0-2] abc");
        Matcher matcher;
        try (BufferedReader br = new BufferedReader(new FileReader(filename))) {

            while (br.ready()){
                matcher = pattern.matcher(br.readLine());
                if (matcher.matches()){
                    System.out.print("matches ");
                }
            }

        } catch (IOException e){
            e.printStackTrace();
        }

        return sentences;

    }

This isn't working (no surprise). Could someone help me get the correct regular expression?

Upvotes: 0

Views: 2412

Answers (3)

SKLTFZ

Reputation: 950

If the test strings are already split into lines, you can validate them one at a time (shown here in C#):

foreach (string line in lines)
{
    Match match = Regex.Match(line, @"^(-?[1-2]\s.*|0\s.*)", RegexOptions.IgnoreCase);
    if (match.Success)
    {
        MessageBox.Show(match.Groups[1].Value);
    }
}

This validates each line and captures the ones that are valid.

As mentioned, it works only if the input has already been split into lines.

To handle one long string whose lines are separated by "\n", the regular expression should be changed to:

string regExp = @"(-?[1-2]\s.+[\n]{1}|(?<!-)0\s.+[\n]{1})";
MatchCollection matches = Regex.Matches(longstr, regExp, RegexOptions.IgnoreCase);
foreach(Match match in matches)
{
    if (match.Success)
    {
        MessageBox.Show(match.Groups[1].Value);
    }
}

The catch with a full string is that, unless the multiline option is enabled, ^ and $ anchor only at the very start and end of the input, so they cannot be used to mark line boundaries.

Once ^ is removed from the expression, a false match occurs: the substring "0 is not valid" is captured from "-0 is not valid".

That is why (?<!-) is required: it rejects a 0 that is immediately preceded by -.
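Since the question is in Java, here is a minimal sketch of the full-string case there. Note that it uses a different technique from the lookbehind above: Java's Pattern.MULTILINE flag makes ^ and $ anchor at every line boundary, so the per-line pattern can be reused. The class and method names (MultiLineScan, validLines) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultiLineScan {
    // With MULTILINE, ^ and $ match at the start and end of each line,
    // not only at the start and end of the whole input.
    private static final Pattern VALID_LINE =
            Pattern.compile("^(?:0|-?[12]) .*$", Pattern.MULTILINE);

    // Collects every line of the text that starts with -2..2 followed by a space.
    static List<String> validLines(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = VALID_LINE.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "1 fine\n-0 is not valid\n-2 also fine\n12 no\n0 ok";
        System.out.println(validLines(text));
    }
}
```

Because ^ is still in the pattern, "-0 is not valid" and "12 no" are rejected without needing a lookbehind.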

Upvotes: 0

jackielpy

Reputation: 82

If I am not getting your question wrong, you will need something like this as your pattern:

-?[0-2]\s[\w\d\s]*

Try RegExr; it is a very good web-based tool for working out regex patterns.
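To try this pattern from Java, each backslash has to be doubled inside the string literal. The sketch below also shows the difference between lookingAt(), which only checks the start of the line, and matches(), which requires the whole line to fit. The class and method names (PrefixCheck, startsValid, wholeLineValid) are hypothetical.

```java
import java.util.regex.Pattern;

class PrefixCheck {
    // The pattern from above, with backslashes escaped for a Java string literal.
    private static final Pattern PREFIX = Pattern.compile("-?[0-2]\\s[\\w\\d\\s]*");

    // lookingAt() anchors at the start but does not require the whole line to match.
    static boolean startsValid(String line) {
        return PREFIX.matcher(line).lookingAt();
    }

    // matches() requires the entire line to fit the pattern.
    static boolean wholeLineValid(String line) {
        return PREFIX.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"1 hello world", "1 hello, world", "-0 x", "3 x"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" start=" + startsValid(s)
                    + " whole=" + wholeLineValid(s));
        }
    }
}
```

Two things to be aware of: with matches(), a line containing punctuation such as "1 hello, world" is rejected because the comma is not in the character class, and the pattern accepts "-0 x", which may or may not be what you want.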

Upvotes: 0

Dillon Davis

Reputation: 7740

You might be looking for a regex similar to the following:

^(0|-?[1-2]) .*

The ^ symbol matches the beginning of the line, (0|...) matches either 0 or the expression that follows, -? matches zero or one occurrence of -, [1-2] matches 1 or 2, the literal space matches a single space character, and .* matches zero or more of any character except a newline.
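A minimal sketch of how this regex might be dropped into the Java code from the question. The class and method names (LineValidator, isValid) are made up for illustration, and reading the file is left out so the example stays self-contained.

```java
import java.util.regex.Pattern;

class LineValidator {
    // 0, or 1/2 with an optional minus sign, then one space, then anything.
    private static final Pattern VALID_LINE = Pattern.compile("^(0|-?[1-2]) .*");

    static boolean isValid(String line) {
        // matches() requires the whole line to match; the trailing .* absorbs any text.
        return line != null && VALID_LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"2 good", "-1 bad", "0 neutral", "3 out of range", "-0 odd", "1"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

In the loop from the question, the call would replace the matcher.matches() check, for example if (LineValidator.isValid(line)) { ... }. The first three samples are accepted; "3 out of range", "-0 odd" and "1" (no space after the number) are rejected.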

Upvotes: 3
