Reputation: 87
I'm trying to search a number of lines of a text file using user input (the string 'question'), but I need to exclude some common terms, as they bring up erroneous results as well as correct ones.
try {
    readName file = new readName(file_name);
    String[] aryLines = file.OpenFile();//OPEN KNOWLEDGE BASE
    //SEARCH EACH ENTRY IN KNOWLEDGE BASE
    for (int i = 0; i < aryLines.length; i = i + 1) {
        String delims = "[ ]+";//SPLIT ITEMS INTO TOKENS
        //CREATE ARRAY OF TOKENS
        String[] tokens = aryLines[i].split(delims);
        //SEARCH THROUGH TOKENS
        for (int j = 0; j < tokens.length; j = j + 1) {
            //MATCH QUESTIONS AGAINST TOKENS AND EXCLUSIONS
            if (question.matches("(.*)" + tokens[j] + "(.*)")) {
                System.out.println(aryLines[i]);
            }
        }
    }
} catch (Exception e) {
    System.out.println(e);
}
I have tried putting in
if (question.matches("(.*)" + tokens[j] + "(.*)")
&& !question.matches(*excluded word*))
But in that case, it produces no results when the search question is entered. Both versions work correctly when the excluded term is omitted from the search question.
I have hunted around on here and in other places, but nothing's working for me so far. Any help much appreciated!
This is a sample of my knowledge base:
Dogs have tails
Donkeys have no humps
If I search for "no tails", it outputs both lines, but I would like to force it to exclude "no" from the search so that it only returns "Dogs have tails".
Upvotes: 0
Views: 2723
Reputation: 2833
I think this is what you are after:
//THIS IS AN EXAMPLE KNOWLEDGE BASE
String[] aryLines = {"Dogs have tails","Donkeys have no humps"};
//THIS IS THE QUESTION SUPPLIED BY POSTER
String question = "no tails";
//IT SEEMS THAT POSTER WANTS TO EXCLUDE CERTAIN WORDS FROM THE SEARCH
String exclude = "no";
//REMOVE ALL WHOLE-WORD OCCURRENCES OF THE EXCLUDE STRING IN QUESTION,
//THEN TRIM THE LEFTOVER SPACE SO THE PATTERN IS "tails" AND NOT " tails"
question = question.replaceAll("\\b" + exclude + "\\b", "").trim();
//FOR EACH TOKEN (FROM KNOWLEDGE BASE)
for (String token : aryLines) {
    //MATCH QUESTION AGAINST TOKENS
    if (token.matches("(.*)" + question + "(.*)")) {
        System.out.println(token);
    }
}
In this example, I remove all occurrences of the excluded string from the question. I then compare each token (here, each line of the knowledge base) to the regex .*<question>.*

Since the excluded strings have been removed prior to the comparison, they no longer affect the outcome of the match: the code compares "Dogs have tails" and "Donkeys have no humps" to .*tails.* and only the first one matches.
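If you end up with more than one excluded word, or a question with several remaining words, a word-by-word comparison may be easier to reason about than building a regex from the question. The following is only a sketch under my own assumptions: the class name ExclusionSearch, the search method and the excluded set are made up for illustration and are not part of your code, and the exclusion list is assumed to be lowercase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not taken from the original post.
class ExclusionSearch {

    // Returns every knowledge-base line that shares at least one
    // non-excluded word with the question (case-insensitive, whole words).
    // Assumption: the entries in "excluded" are already lowercase.
    static List<String> search(String[] lines, String question, Set<String> excluded) {
        // Keep only the question words that are not on the exclusion list.
        Set<String> wanted = new HashSet<>();
        for (String word : question.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!word.isEmpty() && !excluded.contains(word)) {
                wanted.add(word);
            }
        }

        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            for (String token : line.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
                if (wanted.contains(token)) {
                    hits.add(line);
                    break; // report each line only once
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] aryLines = {"Dogs have tails", "Donkeys have no humps"};
        Set<String> excluded = new HashSet<>(Arrays.asList("no"));
        // Prints [Dogs have tails]
        System.out.println(search(aryLines, "no tails", excluded));
    }
}
```

Because the words are compared with equals-style lookups rather than matches, regex metacharacters in the question or the knowledge base cannot break the search, and "no" is not stripped out of longer words such as "know".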
Upvotes: 1