user2453973

Reputation: 309

How to avoid greedy or non-specific regex?

//---------EDITS MADE FOR CLARITY--------//

I've searched the web, but so far no answer has resolved my question. I have a regex that matches more loosely than I want. For example, the following regex:

(?<piece>q|k|b|p|n|r+)(?<color>l|d)(?<x>\\w)(?<y>\\d)

will match

rdh8

rda6

rla1 a3

rlb2

However, I need my regex to be specific; I need it to exclude "rla1 a3". Currently, the regex matches the 'rla1' portion of 'rla1 a3'. I need the regex to reject 'rla1 a3' entirely because of the trailing ' a3'.

I have attempted solutions such as appending \s?$ to the regex, but these have not worked. Any ideas what's wrong?

*EDIT* //------------------------------------------------------------------------------//

Here is some sample code of the problem.

public ArrayList<String> theStringsToBeRead = new ArrayList<String>();

public void addLines() throws IOException
{
    String line;
    //try-with-resources closes the reader even if readLine() throws
    try (BufferedReader br = new BufferedReader(new FileReader("testFile")))
    {
        //put each line into an arraylist
        while((line = br.readLine()) != null)
        {
            theStringsToBeRead.add(line.toLowerCase()); //adds each line to the arraylist
        }
    }
    //parse each line in arraylist
    for(String item : theStringsToBeRead)
    {
        testing(item); //for each line, run a regex check
    }
}

public void testing(String item)
{

    String regex = "(?<piece>q|k|b|p|n|r+)(?<color>l|d)(?<x>\\w)(?<y>\\d)";
    //String StevesRegex = "^(?<piece>q|k|b|p|n|r+)(?<color>l|d)(?<x>\\w)(?<y>\\d)$";  //doesn't appear to work
    Matcher m = Pattern.compile(regex).matcher(item);

    while (m.find()) 
    {
        System.out.println(m.group("piece")+m.group("color")+m.group("x")+m.group("y"));
    }   
}

The text file is:

rdh8
rda6
rla1 a3
rlb2

The output I get is:

rdh8
rda6
rla1
rlb2

The desired result is that the regex completely ignores all of "rla1 a3" instead of matching a portion of it. The desired output would then be:

rdh8
rda6
rlb2

Any help would be appreciated. Sorry for any confusion about the question, and thank you for your patience.

Upvotes: 3

Views: 147

Answers (2)

Steve P.

Reputation: 14699

Your question is a little unclear, but if you want the pattern to match only when it covers the whole input, from beginning to end, then the following would suffice:

String regex = "^(?<piece>q|k|b|p|n|r+)(?<color>l|d)(?<x>\\w)(?<y>\\d)$";

^ says "at the start of the line" and $ says "at the end of the line", so nothing may come before or after the piece, color, x and y.

Here's a link to test it online.
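To make the effect of the anchors concrete, here is a minimal sketch that runs the anchored regex over the four sample lines from the question. The class name AnchoredMatchDemo and the method wholeLineMatches are made up for this illustration, and the lines are passed in as a list instead of being read from testFile.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the asker's code.
public class AnchoredMatchDemo {
    // Same pattern as in the question, anchored with ^ and $.
    private static final Pattern WHOLE_LINE = Pattern.compile(
            "^(?<piece>q|k|b|p|n|r+)(?<color>l|d)(?<x>\\w)(?<y>\\d)$");

    // Returns only the lines that the pattern covers from start to end.
    public static List<String> wholeLineMatches(List<String> lines) {
        List<String> matched = new ArrayList<>();
        for (String line : lines) {
            Matcher m = WHOLE_LINE.matcher(line);
            // With both anchors in place, find() can only succeed on the full line.
            if (m.find()) {
                matched.add(m.group("piece") + m.group("color")
                        + m.group("x") + m.group("y"));
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("rdh8", "rda6", "rla1 a3", "rlb2");
        System.out.println(wholeLineMatches(input)); // prints [rdh8, rda6, rlb2]
    }
}
```

For lines that come from readLine(), which have no line terminator, calling m.matches() on the unanchored pattern should behave the same way, because matches() only succeeds when the entire input is consumed.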

EDIT:
Your issue is due to whitespace. To remove whitespace before or after the string, use .trim(), i.e.:

String s = br.readLine().trim().toLowerCase();

Alternatively, you could change the regex to account for whitespace as follows:

String regex = "^\\s*(?<piece>q|k|b|p|n|r+)(?<color>l|d)(?<x>\\w)(?<y>\\d)\\s*$";
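As a rough comparison of the two options, the sketch below checks a few strings both ways: trimming first and then applying the strict anchored regex, or leaving the string alone and using the regex with \\s* on either side. The class and method names (WhitespaceTolerantDemo, matchesAfterTrim, matchesWithPadding) are invented for the example, and the padded inputs are guesses at what the file might contain.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the asker's code.
public class WhitespaceTolerantDemo {
    // Strict pattern: no whitespace allowed around the move.
    private static final Pattern STRICT = Pattern.compile(
            "^(?<piece>q|k|b|p|n|r+)(?<color>l|d)(?<x>\\w)(?<y>\\d)$");

    // Padded pattern: optional whitespace before and after the move.
    private static final Pattern PADDED = Pattern.compile(
            "^\\s*(?<piece>q|k|b|p|n|r+)(?<color>l|d)(?<x>\\w)(?<y>\\d)\\s*$");

    // Option 1: strip surrounding whitespace, then apply the strict pattern.
    public static boolean matchesAfterTrim(String line) {
        return STRICT.matcher(line.trim()).find();
    }

    // Option 2: leave the line untouched and let the pattern absorb the padding.
    public static boolean matchesWithPadding(String line) {
        return PADDED.matcher(line).find();
    }

    public static void main(String[] args) {
        // Assumed inputs: a padded valid line and the line that must be rejected.
        System.out.println(matchesAfterTrim("rdh8  "));      // true
        System.out.println(matchesWithPadding("  rlb2\t"));  // true
        System.out.println(matchesAfterTrim("rla1 a3"));     // false
        System.out.println(matchesWithPadding("rla1 a3"));   // false
    }
}
```

Both options still reject "rla1 a3", because the space sits in the middle of the text and is followed by more characters, which neither trim() nor the trailing \\s* can skip over.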

Upvotes: 4

Tim Pierce

Reputation: 5664

It looks to me as though this part of your regex matches rdh8:

(?<piece>q|k|b|p|n|r+)(?<color>l|d)(?<x>\\w)(?<y>\\d)

and this is the part that matches rla1 a3:

\s+(.*?)

If you want your regex only to match the first expression it finds and stop matching there, try removing the \s+(.*?) from the end of the regex.

Upvotes: 0
