Reputation: 169
I have a private method that I'm testing, provided below:
private boolean containsExactDrugName(String testString, String drugName) {
    Matcher m = Pattern.compile("\\b(?:" + drugName + ")\\b|\\S+", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE).matcher(testString);
    ArrayList<String> results = new ArrayList<>();
    while (m.find()) {
        results.add(m.group());
    }
    boolean found = results.contains(drugName);
    return found;
}
The method takes a text String and a medication name and returns a boolean. The match needs to be case-insensitive, but the last assertion of the test is failing. The test is provided below:
@Test
public void test_getRiskFactors_givenTextWith_Orlistat_Should_Not_Find_Medication() throws Exception {
    String drugName = "Orlistat";
    assertEquals("With Orlistat", true, containsExactDrugName("The patient is currently being treated with Orlistat", drugName));
    assertEquals("With Orlistattesee", false, containsExactDrugName("The patient is currently being treated with Orlistattesee", drugName));
    assertEquals("With abcOrlistat", false, containsExactDrugName("The patient is currently being treated with abcOrlistat", drugName));
    assertEquals("With orlistat", true, containsExactDrugName("The patient is currently being treated with orlistat", drugName));
}
In the last assertion the drug name is in lower case (orlistat) but still needs to match the provided parameter Orlistat. I used Pattern.CASE_INSENSITIVE, but it's not working. How should I write the code properly?
Upvotes: 1
Views: 2652
Reputation: 18233
The problem isn't mainly in your regular expression; it's in the containsExactDrugName method itself. You're doing case-insensitive matching to find the drugName within the larger string, but then you look for an exact match of the drugName within the resulting list of matched strings:
results.contains(drugName)
This check is not only redundant (since the regex already did the work of finding the matches), it's actively breaking your function, because contains compares elements with equals, which is an exact, case-sensitive comparison. Simply get rid of it:
private boolean containsExactDrugName(String testString, String drugName) {
    Matcher m = Pattern.compile("\\b(?:" + drugName + ")\\b", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE).matcher(testString);
    List<String> results = new ArrayList<>();
    while (m.find()) {
        results.add(m.group());
    }
    return !results.isEmpty();
}
Actually, since you're not keeping track of the number of times you've found drugName, the entire list is pointless, and you can simplify your method to:
private boolean containsExactDrugName(String testString, String drugName) {
    Matcher m = Pattern.compile("\\b(?:" + drugName + ")\\b", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE).matcher(testString);
    return m.find();
}
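For anyone who wants to run this outside the original class, here is a minimal self-contained sketch of the simplified method together with the four cases from the question's test. The class name DrugNameMatcher is hypothetical, and the call to Pattern.quote is an addition that was not in the question: it assumes the drug name should be treated as literal text rather than as a regex fragment.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper class, used only to make the sketch runnable.
class DrugNameMatcher {

    // Returns true if drugName occurs in testString as a whole word, ignoring case.
    static boolean containsExactDrugName(String testString, String drugName) {
        // Assumption: drugName is literal text, so it is quoted before being
        // embedded in the pattern (Pattern.quote is not in the original code).
        String regex = "\\b" + Pattern.quote(drugName) + "\\b";
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
                .matcher(testString)
                .find();
    }

    public static void main(String[] args) {
        String prefix = "The patient is currently being treated with ";
        String drugName = "Orlistat";
        System.out.println(containsExactDrugName(prefix + "Orlistat", drugName));       // true
        System.out.println(containsExactDrugName(prefix + "Orlistattesee", drugName));  // false
        System.out.println(containsExactDrugName(prefix + "abcOrlistat", drugName));    // false
        System.out.println(containsExactDrugName(prefix + "orlistat", drugName));       // true
    }
}
```

The word boundaries reject Orlistattesee and abcOrlistat because a word character sits directly next to the name on one side, so \\b cannot match there.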
Edit - Your regex is also too permissive. It's matching on \\S+, which means any sequence of one or more non-whitespace characters. I'm not sure why you included that, but it's causing your regex to match things that are not the drugName. Remove the |\\S+ section of the expression.
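To make the effect of that alternative visible, here is a small sketch that collects every match for both versions of the pattern. The class AlternationDemo and the helper allMatches are hypothetical names introduced for illustration, and the sample text is shortened from the question's test string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing the pattern with and without the \S+ alternative.
class AlternationDemo {

    // Collects every substring that the given regex finds in text (case-insensitive).
    static List<String> allMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE).matcher(text);
        List<String> results = new ArrayList<>();
        while (m.find()) {
            results.add(m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        String text = "treated with abcOrlistat";

        // With the \S+ alternative, every whitespace-separated token is a match.
        System.out.println(allMatches("\\b(?:Orlistat)\\b|\\S+", text)); // [treated, with, abcOrlistat]

        // Without it, only whole-word occurrences of the drug name match.
        System.out.println(allMatches("\\b(?:Orlistat)\\b", text));      // []
    }
}
```

With the alternative left in, find() succeeds on any text that contains at least one non-whitespace character, so it cannot be used on its own to decide whether the drug name is present.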
Upvotes: 2
Reputation: 3709
You need (?i) before the part of the pattern that you want to make case-insensitive.
Change your regex from
\\b(?:" + drugName + ")\\b|\\S+
to this
(?i)\\b(" + drugName + ")\\b|\\S+
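As a rough illustration of how the position of (?i) matters, the sketch below places the flag at the start of a pattern and in the middle of one. The class InlineFlagDemo and the helper matches are hypothetical names, and the patterns are simplified stand-ins rather than the regex from the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class showing the scope of the inline (?i) flag.
class InlineFlagDemo {

    // Returns true if the whole input matches the regex (no flags passed to compile).
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Flag at the start: the whole pattern ignores case.
        System.out.println(matches("(?i)orlistat", "ORLISTAT"));   // true

        // Flag in the middle: only the part after it ignores case.
        System.out.println(matches("orli(?i)stat", "orliSTAT"));   // true
        System.out.println(matches("orli(?i)stat", "ORLIstat"));   // false
    }
}
```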
Upvotes: 1