Nikitin Mikhail

Reputation: 3019

Matcher can't match

I have the following code. I need to check whether a text contains any word from a list of banned words. But even when such a word is present in the text, the matcher doesn't find it. Here is the code:

final ArrayList<String> regexps = config.getProperty(property);
for (String regexp : regexps) {
    Pattern pt = Pattern.compile("(" + regexp + ")", Pattern.CASE_INSENSITIVE);
    Matcher mt = pt.matcher(plainText);
    if (mt.find()) {
        result = result + "message can't be processed because it doesn't satisfy the rule " + property;
        reason = false;
        System.out.println("reason" + mt.group() + regexp);
    }
}

What is wrong? This code can't find the regexp в[ыy][шs]лит[еe] in plainText = "Вышлите пожалуйста новый счет на оплату на Санг, пока согласовывали, уже прошли его сроки. Лиценз...". I also tried other variants of the regexp, but nothing works.

Upvotes: 0

Views: 229

Answers (3)

Narendra Yadala

Reputation: 9664

Try this to filter out messages that contain banned words. It builds a single regex that joins the keywords with the OR operator (|).

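// Requires: java.util.ArrayList, java.util.regex.Matcher, java.util.regex.Pattern
private static void findBannedWords() {
    final ArrayList<String> keywords = new ArrayList<String>();
    keywords.add("f$%k");
    keywords.add("s!@t");
    keywords.add("a$s");

    String input = "what the f$%k";

    // Build one alternation: .*kw1.*|.*kw2.*|...
    // Pattern.quote() escapes metacharacters such as '$', which would
    // otherwise act as an end-of-input anchor and prevent the match.
    String bannedRegex = "";
    for (String keyword : keywords) {
        bannedRegex = bannedRegex + ".*" + Pattern.quote(keyword) + ".*" + "|";
    }

    // Drop the trailing '|' before compiling.
    Pattern pt = Pattern.compile(bannedRegex.substring(0, bannedRegex.length() - 1));
    Matcher mt = pt.matcher(input);
    if (mt.matches()) {
        System.out.println("message can't be processed because it doesn't satisfy the rule ");
    }
}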
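
A possible variation, as a sketch only: since find() searches anywhere in the input, the surrounding .* can be dropped, and Pattern.quote() makes metacharacters such as $ match literally. The class and method names (BannedWordsSketch, buildPattern, containsBanned) are made up for illustration, and the sketch assumes the keywords are literal words rather than regexps and that the list is not empty.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class BannedWordsSketch {

    // Builds one pattern of the form \Qword1\E|\Qword2\E|...
    // Assumes a non-empty list; an empty pattern would match any input.
    static Pattern buildPattern(List<String> keywords) {
        StringBuilder sb = new StringBuilder();
        for (String keyword : keywords) {
            if (sb.length() > 0) {
                sb.append('|');
            }
            // quote() turns the keyword into a literal, so '$' is not an anchor
            sb.append(Pattern.quote(keyword));
        }
        return Pattern.compile(sb.toString(),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    // find() looks for the pattern anywhere in the input, so no ".*" is needed
    static boolean containsBanned(String input, List<String> keywords) {
        return buildPattern(keywords).matcher(input).find();
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("f$%k", "s!@t", "a$s");
        System.out.println(containsBanned("what the f$%k", keywords)); // true
        System.out.println(containsBanned("all good here", keywords)); // false
    }
}
```

Compiling the pattern once and reusing it for every message avoids rebuilding the regex per call.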

Upvotes: 0

Lee Fogg

Reputation: 795

Use ArrayList's built-in methods contains(Object o) and indexOf(Object o) to check whether a String exists anywhere in the list, and at which index, e.g.

ArrayList<String> keywords = new ArrayList<String>();
keywords.add("hello");
System.out.println(keywords.contains("hello"));
System.out.println(keywords.indexOf("hello"));

outputs:
true
0
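
Note that contains() compares whole list elements using equals(), so it does not search inside a longer text. A minimal sketch of applying it to a message, assuming words are separated by whitespace and the banned list is stored in lower case (ContainsCheck and hasBannedWord are hypothetical names):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class ContainsCheck {

    // Splits the text on whitespace and looks each token up in the list.
    // Punctuation attached to a word (e.g. "hello,") is NOT stripped here.
    static boolean hasBannedWord(String text, List<String> banned) {
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (banned.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> banned = Arrays.asList("hello");
        System.out.println(hasBannedWord("Hello world", banned));   // true
        System.out.println(hasBannedWord("goodbye world", banned)); // false
    }
}
```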

Upvotes: 0

ctn

Reputation: 2930

The trouble is elsewhere.

import java.util.regex.*;

public class HelloWorld {

    public static void main(String[] args) {
        Pattern pt = Pattern.compile("(qwer)");
        Matcher mt = pt.matcher("asdf qwer zxcv");
        // find() succeeds if the pattern occurs anywhere in the input
        System.out.println(mt.find());
    }
}

This prints true. You may want to use word boundaries as delimiters, though:

import java.util.regex.*;

public class HelloWorld {

    public static void main(String[] args) {
        Pattern pt = Pattern.compile("\\bqwer\\b");
        Matcher mt = pt.matcher("asdf qwer zxcv");
        System.out.println(mt.find()); // true: "qwer" stands alone as a word
        mt = pt.matcher("asdfqwer zxcv");
        System.out.println(mt.find()); // false: no boundary between "asdf" and "qwer"
    }
}

The parentheses are useless unless you need to capture the keyword in a group. But you already have the keyword to begin with.
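
One place the trouble could lie, given the Cyrillic pattern in the question: by default, Pattern.CASE_INSENSITIVE folds case only for US-ASCII characters, so a lowercase в in the regexp does not match the uppercase В at the start of "Вышлите". Combining it with Pattern.UNICODE_CASE enables Unicode-aware case folding. The sketch below assumes this is the cause, which the question does not confirm; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class UnicodeCaseDemo {

    // Case-insensitive for US-ASCII letters only (the default behaviour)
    static boolean findAsciiOnly(String regexp, String text) {
        return Pattern.compile(regexp, Pattern.CASE_INSENSITIVE)
                .matcher(text).find();
    }

    // Case-insensitive for all Unicode letters, including Cyrillic
    static boolean findUnicode(String regexp, String text) {
        return Pattern.compile(regexp,
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
                .matcher(text).find();
    }

    public static void main(String[] args) {
        String regexp = "в[ыy][шs]лит[еe]";
        String text = "Вышлите пожалуйста новый счет";
        System.out.println(findAsciiOnly(regexp, text)); // false
        System.out.println(findUnicode(regexp, text));   // true
    }
}
```

The embedded flag form "(?iu)" at the start of the regexp has the same effect, which can help when the patterns come from a configuration file.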

Upvotes: 1
