Reputation: 197
I am writing code to spot country names in text. I have a dictionary of country names (say India, America, Sri Lanka, ...), and I currently call text.contains(key) for each key in the dictionary. However, this returns true even for a string like Indiana. I tried splitting the sentence into an array of words and calling contains on that, and a similar approach using equals, but both are really slow. Is there a faster way to do this?
Upvotes: 2
Views: 11177
Reputation: 388316
Try the word boundary anchor \b:
s.matches(".*\\b" + key + "\\b.*")
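A slightly more defensive variant of the same idea, as a sketch only: it assumes the key should be matched literally (so it is escaped with Pattern.quote) and uses find() rather than matches(), which removes the need for the leading and trailing .* parts. The class and method names (WordMatch, wholeWord) are made up for illustration.

```java
import java.util.regex.Pattern;

public class WordMatch {
    // Builds a pattern that matches key only as a whole word.
    // Pattern.quote escapes any regex metacharacters the key may contain.
    static Pattern wholeWord(String key) {
        return Pattern.compile("\\b" + Pattern.quote(key) + "\\b");
    }

    public static void main(String[] args) {
        // Compile once, then reuse the pattern for every text to be checked.
        Pattern india = wholeWord("India");
        System.out.println(india.matcher("I live in Indiana").find()); // false
        System.out.println(india.matcher("I live in India").find());   // true
    }
}
```

Keeping the compiled Pattern around matters when many strings are checked, because String.matches recompiles the expression on every call.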
Upvotes: 9
Reputation: 4107
Maybe you should be using some text processing library.
Here is a regex solution:
import java.util.regex.*;
import static java.lang.System.*;

public class SO {
    public static void main(String[] args) {
        String[] dict = {"india", "america"};
        // One pattern for the whole dictionary: any entry, delimited by word boundaries.
        String patStr = ".*\\b(" + combine(dict, "|") + ")\\b.*";
        out.println("pattern: " + patStr + "\n");
        // Compile once and reuse the pattern for every input.
        Pattern pat = Pattern.compile(patStr);

        String input1 = "hello world india indiana";
        out.println(input1 + "\t" + pat.matcher(input1).matches());
        String input2 = "hello world america americana";
        out.println(input2 + "\t" + pat.matcher(input2).matches());
        String input3 = "hello world indiana amercana";
        out.println(input3 + "\t" + pat.matcher(input3).matches());
    }

    // Joins the strings in s with glue between them; returns null for an empty array.
    static String combine(String[] s, String glue) {
        int k = s.length;
        if (k == 0) return null;
        StringBuilder out = new StringBuilder();
        out.append(s[0]);
        for (int x = 1; x < k; ++x)
            out.append(glue).append(s[x]);
        return out.toString();
    }
}
Output:
pattern: .*\b(india|america)\b.*
hello world india indiana true
hello world america americana true
hello world indiana amercana false
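If you also need to know which names occurred, the same alternation can be scanned with find() instead of matches(). The following is a minimal sketch under a few assumptions: the dictionary is non-empty, matching should ignore case, and entries are literal text (hence Pattern.quote). CountrySpotter, compile and spot are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CountrySpotter {
    // Builds \b(?:name1|name2|...)\b with every name escaped as a literal.
    // Assumes names is non-empty.
    static Pattern compile(Collection<String> names) {
        StringJoiner alt = new StringJoiner("|", "\\b(?:", ")\\b");
        for (String n : names) {
            alt.add(Pattern.quote(n));
        }
        return Pattern.compile(alt.toString(), Pattern.CASE_INSENSITIVE);
    }

    // Walks the text once and collects every whole-word occurrence, in order.
    static List<String> spot(Pattern pat, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = pat.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        Pattern pat = compile(Arrays.asList("India", "America", "Sri Lanka"));
        System.out.println(spot(pat, "hello world india indiana")); // [india]
        System.out.println(spot(pat, "From Sri Lanka to America")); // [Sri Lanka, America]
    }
}
```

Because the pattern is compiled once for the whole dictionary, each text is scanned a single time rather than once per key.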
Upvotes: 1
Reputation: 5919
contains() should have worked. You can also try String.indexOf(String). If it returns anything other than -1, the query string exists in the given String; otherwise it does not.
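Note that indexOf, like contains, matches substrings, so on its own it still reports India inside Indiana. One regex-free way around that, sketched here under the assumption that a "word" is delimited by anything other than a letter or digit, is to check the characters on either side of each hit. IndexOfWord and containsWord are hypothetical names.

```java
public class IndexOfWord {
    // Returns true if key occurs in text with no letter or digit directly
    // before or after it. Keeps searching past hits that fail the check.
    static boolean containsWord(String text, String key) {
        if (key.isEmpty()) {
            return false;
        }
        int from = 0;
        while (true) {
            int i = text.indexOf(key, from);
            if (i < 0) {
                return false; // no further occurrences
            }
            int end = i + key.length();
            boolean startOk = i == 0 || !Character.isLetterOrDigit(text.charAt(i - 1));
            boolean endOk = end == text.length() || !Character.isLetterOrDigit(text.charAt(end));
            if (startOk && endOk) {
                return true;
            }
            from = i + 1; // this hit was part of a longer word; try the next one
        }
    }

    public static void main(String[] args) {
        System.out.println(containsWord("Indiana", "India"));           // false
        System.out.println(containsWord("Indiana and India", "India")); // true
    }
}
```

This check is case-sensitive and differs slightly from the regex \b, which also treats an underscore as a word character.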
Upvotes: 0