Reputation: 81
Q: If I were given an exorbitantly large file filled with random English words and were told to find specific substrings separated by whitespace (for example, "how now", "brown cow", etc.) and return the position at which each one appears, how would I do it?
A: I have a partial solution, but I'm asking the Stack Overflow community for help completing the last bit.
How Program Should Run:
Returns line number and word number; the word number is relative to the start of the line
If "how now" is found as the first two words of two consecutive lines, it would return "how now" found on line k at position 1, and found once again on line k+1 at position 1 as well.
If the line is "how now the count of monte brown cow cristo", then it should be able to detect "how now" and "brown cow" as two separate occurrences.
Solution 1:
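import java.io.File;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

int chn = 0; // occurrences of "how now"
int cbc = 0; // occurrences of "brown cow"
Scanner in = new Scanner(new File("filename.txt"));
Pattern phn = Pattern.compile("how now");
Pattern pbc = Pattern.compile("brown cow");
Matcher mhn = null;
Matcher mbc = null;
// Read inside the loop so that the last line is processed too
while (in.hasNextLine()) {
    String temp = in.nextLine();
    mhn = phn.matcher(temp);
    while (mhn.find()) chn++;
    mbc = pbc.matcher(temp);
    while (mbc.find()) cbc++;
} // Formatted output comes after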
This uses Patterns and Matchers to count the occurrences (chn, cbc) in the order they appear, and it is the faster of my two attempts. However, I'm at a loss for how to keep track of where in the line each match occurs.
Solution 2:
Scanner in = new Scanner(new File("filename.txt"));
ArrayList<String> wordsInLine = new ArrayList<>();
int ctL = 1; // current line number, 1-based
while (in.hasNextLine()) {
    String temp = in.nextLine();
    if (temp.contains("how now")) {
        wordsInLine.clear(); // drop the words of the previous matching line
        for (String word : temp.split(" ")) {
            wordsInLine.add(word);
        }
        // Stop one short of the end so that i + 1 stays in bounds
        for (int i = 0; i < wordsInLine.size() - 1; i++) {
            // Both words must match, hence && rather than ||
            if (wordsInLine.get(i).equals("how") &&
                    wordsInLine.get(i + 1).equals("now")) {
                System.out.println("This returns line count and "
                        + "the occurrence by getting i");
            }
        }
    }
    ctL++;
}
But this second partial solution seems incredibly inefficient and terribly slow, using two for loops for every line that contains "how now."
Is there a more elegant way of doing this?
Upvotes: 2
Views: 1329
Reputation: 2155
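Go with Solution 1. Use the Matcher's start(), end() and group() methods to track each matched subsequence:
mhn = phn.matcher(temp);
while (mhn.find()) {
    System.out.print(mhn.start() + ", ");  // index of the first matched character
    System.out.print(mhn.end() + ", ");    // index just past the last matched character
    System.out.println(mhn.group());       // the matched text itself
    chn++;
}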
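Note that start() and end() are character offsets, while the question asks for a word number. Here is a minimal sketch of one way to bridge the two: count the whitespace-separated words that come before the match. The class name PhraseLocator and the helpers wordNumberAt and locate are made up for illustration, and the lines are passed in as a List rather than read from a file to keep the example short. I also added \b word boundaries, which the original pattern does not have.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhraseLocator {

    // Hypothetical helper: converts a character offset (as returned by
    // Matcher.start()) into a 1-based word number by counting the
    // whitespace-separated words that precede it.
    static int wordNumberAt(String line, int charOffset) {
        String before = line.substring(0, charOffset).trim();
        return before.isEmpty() ? 1 : before.split("\\s+").length + 1;
    }

    // Hypothetical helper: returns one report string per occurrence of
    // phrase, with 1-based line and word numbers.
    static List<String> locate(List<String> lines, String phrase) {
        // \b keeps "how now" from matching inside e.g. "show nowhere"
        Pattern p = Pattern.compile("\\b" + Pattern.quote(phrase) + "\\b");
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            Matcher m = p.matcher(line);
            while (m.find()) {
                hits.add("\"" + m.group() + "\" found on line " + (i + 1)
                        + " at position " + wordNumberAt(line, m.start()));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("how now the count", "how now brown cow");
        locate(lines, "how now").forEach(System.out::println);
    }
}
```

For a very large file you would feed locate one line at a time from the Scanner instead of building a List first; the word-counting step stays the same.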
Upvotes: 0
Reputation: 3577
Solution 1 is considerably more efficient, so I would go with that approach.
To keep track of where the matched pattern occurs in a specific line, you can use the start() or end() method of the Matcher class to get the corresponding indices. These are character offsets into the line, not word numbers.
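As a rough illustration of what those indices look like, the sketch below runs a single pattern over the example line from the question. Combining both phrases into one alternation is my own variation, not something Solution 1 does, and the class name MatchSpans and method spans are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchSpans {

    // Assumption: one alternation covers both phrases, so each line is
    // scanned once and matches come back in the order they appear.
    static final Pattern PHRASES =
            Pattern.compile("\\b(?:how now|brown cow)\\b");

    // Returns "phrase [start, end)" for every match in the line.
    static List<String> spans(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = PHRASES.matcher(line);
        while (m.find()) {
            // start() is inclusive, end() is exclusive
            out.add(m.group() + " [" + m.start() + ", " + m.end() + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        // prints "how now [0, 7)" and then "brown cow [27, 36)"
        spans("how now the count of monte brown cow cristo")
                .forEach(System.out::println);
    }
}
```

To get a word number from start(), count the whitespace-separated words in the part of the line that comes before that offset and add one.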
Upvotes: 2