Reputation: 15052
Here's the code I've been working on:
while ((lineContents = tempFileReader.readLine()) != null)
{
    String lineByLine = lineContents.replaceAll("/\\.", System.getProperty("line.separator")); //for matching /. and replacing it by new line
    changer.write(lineByLine);
    Pattern pattern = Pattern.compile("\\r?\\n"); //Find new line
    Matcher matcher = pattern.matcher(lineByLine);
    while(matcher.find())
    {
        Pattern tagFinder = Pattern.compile("word"); //Finding the word required
        Matcher tagMatcher = tagFinder.matcher(lineByLine);
        while(tagMatcher.find())
        {
            score++;
        }
        scoreTracker.add(score);
        score = 0;
    }
}
My sample input contains 6 lines, with the occurrences of word per line being [0,1,0,3,0,0].
So when I print scoreTracker (which is an ArrayList), I want the above output.
But instead I get [4,4,4,4,4,4], which is the total number of occurrences of the word, not the count line by line.
Kindly help.
Upvotes: 2
Views: 1230
Reputation: 61
The original code was reading the input one line at a time using tempFileReader.readLine() and then looking for line endings within each line using matcher. Since lineContents contains only one line, matcher never finds a new line, so the rest of the code is skipped.
Why do you need two different bits of code to split the input into lines?
You could remove one of the bits of code relating to finding the new lines. E.g.
while ((lineContents = tempFileReader.readLine()) != null)
{
    Pattern tagFinder = Pattern.compile("word"); //Finding the word required
    Matcher tagMatcher = tagFinder.matcher(lineContents);
    while(tagMatcher.find())
    {
        score++;
    }
    scoreTracker.add(score);
    score = 0;
}
I've tried the code above on Windows using a file test.txt read by a BufferedReader, e.g.
    BufferedReader tempFileReader = new BufferedReader(new FileReader("c:\\test\\test.txt"));
scoreTracker contains [0, 1, 0, 3, 0, 0] for a file which has the content you describe.
I don't understand how you got [4,4,4,4,4,4] out of the original code if the sample input is an actual file as described and tempFileReader is a BufferedReader. It would be useful to see the code you use to set up tempFileReader.
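For anyone who wants to reproduce the per-line count without a file on disk, here is a minimal, self-contained sketch. The class name LineWordCounter, the helper countPerLine and the six-line sample string are hypothetical stand-ins, not part of the original code; a StringReader takes the place of the FileReader, and the word is quoted so it is treated as a literal.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineWordCounter {
    // Counts matches of the literal word on each line supplied by the reader.
    static List<Integer> countPerLine(BufferedReader reader, String word) {
        // Compile once, outside the loop; Pattern.quote treats the word as a literal.
        Pattern tagFinder = Pattern.compile(Pattern.quote(word));
        List<Integer> scoreTracker = new ArrayList<>();
        try {
            String lineContents;
            while ((lineContents = reader.readLine()) != null) {
                Matcher tagMatcher = tagFinder.matcher(lineContents);
                int score = 0; // reset for every line
                while (tagMatcher.find()) {
                    score++;
                }
                scoreTracker.add(score);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return scoreTracker;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical six-line sample standing in for test.txt.
        String sample = "alpha\nword here\nbeta\nword word word\ngamma\ndelta\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println(countPerLine(reader, "word")); // prints [0, 1, 0, 3, 0, 0]
        }
    }
}
```

Declaring score inside the loop removes the need to reset it by hand after each line.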
Upvotes: 1
Reputation: 9664
lineByLine points to the entire contents of your file. That is the reason you are getting [4,4,4,4,4,4]. You need to store each line in another variable, line, and then match against it with tagFinder.matcher(line).
The final code will look like this:
while ((lineContents = tempFileReader.readLine()) != null)
{
    String lineByLine = lineContents.replaceAll("/\\.", System.getProperty("line.separator")); //for matching /. and replacing it by new line
    changer.write(lineByLine);
    Pattern pattern = Pattern.compile(".*\\r?\\n"); //Find new line
    Matcher matcher = pattern.matcher(lineByLine);
    while(matcher.find())
    {
        Pattern tagFinder = Pattern.compile("word"); //Finding the word required
        //matcher.group() returns the input subsequence matched by the previous match.
        Matcher tagMatcher = tagFinder.matcher(matcher.group());
        while(tagMatcher.find())
        {
            score++;
        }
        scoreTracker.add(score);
        score = 0;
    }
}
Upvotes: 3
Reputation: 8874
You can use the Scanner class. Set the Scanner's delimiter to the string you want to count, then count how many tokens the Scanner finds.
You can also initialize the Scanner directly from a File or a FileInputStream.
The resulting code has only 9 lines:
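File file = new File(fileName);
Scanner scanner = new Scanner(file);
scanner.useDelimiter("your text here"); // the delimiter is interpreted as a regular expression
int occurrences = 0; // a local variable must be initialized before it is incremented
while(scanner.hasNext()){
    scanner.next(); // each token is the text between two delimiters
    occurrences++;
}
scanner.close();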
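One caveat worth checking before relying on this approach: with the search text as the delimiter, the Scanner hands back the pieces between occurrences, so the token count does not have to equal the number of occurrences. The sketch below compares the two counts on a hypothetical sample string; the class and method names are illustrative only.

```java
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScannerVersusMatcher {
    // Counts the tokens a Scanner produces when the word is used as the delimiter.
    static int countTokens(String text, String word) {
        int tokens = 0;
        try (Scanner scanner = new Scanner(text)) {
            // useDelimiter takes a regex, so quote the word to match it literally.
            scanner.useDelimiter(Pattern.quote(word));
            while (scanner.hasNext()) {
                scanner.next();
                tokens++;
            }
        }
        return tokens;
    }

    // Counts the matches of the word itself.
    static int countMatches(String text, String word) {
        Matcher matcher = Pattern.compile(Pattern.quote(word)).matcher(text);
        int matches = 0;
        while (matcher.find()) {
            matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "a word b word c"; // hypothetical sample
        System.out.println(countTokens(text, "word"));  // 3 tokens: "a ", " b ", " c"
        System.out.println(countMatches(text, "word")); // 2 occurrences of "word"
    }
}
```

If the exact number of occurrences matters, counting Matcher.find() hits is the more direct route.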
Upvotes: 0
Reputation: 790
This is because each time you are searching the same string (lineByLine). What you probably intended was to search each line separately. I suggest you do:
Pattern tagFinder = Pattern.compile("word"); //Finding the word required
for(String line : lineByLine.split("\\n")) //closing parenthesis was missing
{
    Matcher tagMatcher = tagFinder.matcher(line);
    while(tagMatcher.find())
    {
        score++;
    }
    scoreTracker.add(score);
    score = 0;
}
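The same idea can be taken one step further by splitting on the "/." marker directly, instead of first replacing it with a line separator and then searching for line breaks. The sketch below assumes that "/." is what separates the logical lines in the input, as the comment in the question suggests; the class name, method name and sample string are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitAndCount {
    // Splits the text on the "/." marker and counts the word in each segment.
    static List<Integer> countPerSegment(String text, String word) {
        Pattern tagFinder = Pattern.compile(Pattern.quote(word)); // literal match
        List<Integer> scoreTracker = new ArrayList<>();
        // Assumption: "/." marks the end of a logical line in the input.
        for (String line : text.split("/\\.")) {
            Matcher tagMatcher = tagFinder.matcher(line);
            int score = 0;
            while (tagMatcher.find()) {
                score++;
            }
            scoreTracker.add(score);
        }
        return scoreTracker;
    }

    public static void main(String[] args) {
        // Hypothetical input with six logical lines separated by "/.".
        String text = "none/.one word/.none/.word word word/.none/.none";
        System.out.println(countPerSegment(text, "word")); // prints [0, 1, 0, 3, 0, 0]
    }
}
```

Note that String.split drops trailing empty segments, so a marker at the very end of the text does not add an extra zero.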
Upvotes: 1
Reputation: 46943
Maybe this code will help you:
String str = "word word\n \n word word\n \n word\n";
Pattern pattern = Pattern.compile("(.*)\\r?\\n"); //Find new line
Matcher matcher = pattern.matcher(str);
while(matcher.find())
{
    Pattern tagFinder = Pattern.compile("word"); //Finding the word required
    Matcher tagMatcher = tagFinder.matcher(matcher.group());
    int score = 0;
    while(tagMatcher.find())
    {
        score++;
    }
    System.out.print(score + " ");
}
The output is 2 0 2 0 1
It is not highly optimized, but your problem was that you never restricted the inner matching to the current line, so it always scanned the whole string.
Upvotes: 1