Reputation: 571
I have to check whether the words from File1 exist in File2, and count how many times each one occurs. The data in both files is shown below.
The words in File1 look like this:
The data in File2 looks like this:
The code I have written is as follows:
File file1 = new File("ChineseWord.txt");
Scanner sc = new Scanner(new FileInputStream(file1));
ArrayList<String> list = new ArrayList<String>();
ArrayList<String> newList = new ArrayList<String>();
while(sc.hasNext()){
    list.add(sc.next());
}
sc.close();
File file2 = new File("RandomData.txt");
Scanner newScanner = new Scanner(new FileInputStream(file2));
int count = 0;
for (int i = 0; i < list.size(); i++) {
    while(newScanner.hasNext()){
        String word = newScanner.nextLine();
        String toMatch = list.get(i);
        if(word.contains(toMatch)){
            System.out.println("Success");
            count++;
        }
    }
    String test = list.get(i);
    newList.add(test+"exists" + count+ "times");
    count =0;
}
The problem is that it returns 0 for all words, even though the very first word in File1 exists in the very first line of File2. If I hard-code the word like this:
if(word.contains("发表")){
    System.out.println("Success");
    count++;
}
it prints "Success"; otherwise it does not. Why is this so?
Upvotes: 1
Views: 1933
Reputation:
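The problem is in your logic: you loop over each word in list, but the scanner on File2 is created only once, outside that loop. The inner while loop consumes the whole file for the first word, so for every later word the scanner is already exhausted and the count stays 0.
You should probably move the loop over list inside the while loop, just around the if (word.contains(toMatch)).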
Following your comment, I did a quick test with:
package so36862093;

import com.google.common.io.Resources;
import java.io.File;
import java.io.FileInputStream;
import java.nio.file.Files;
import java.util.*;

public class App {
    public static void main(final String[] args) throws Exception {
        final File file1 = new File(Resources.getResource("so36862093/ChineseWord.txt").toURI());
        final List<String> list = Files.readAllLines(file1.toPath());
        final File file2 = new File(Resources.getResource("so36862093/RandomData.txt").toURI());
        final Scanner newScanner = new Scanner(new FileInputStream(file2));
        final Map<String, Integer> count = new HashMap<>();
        while(newScanner.hasNext()){
            final String word = newScanner.nextLine();
            for (String toMatch : list) {
                if(word.contains(toMatch)){
                    System.out.println("Success");
                    count.put(toMatch, count.getOrDefault(toMatch, 0) + 1);
                }
            }
        }
        for (Map.Entry<String, Integer> e : count.entrySet()) {
            System.out.println(e.getKey() + " exists " + e.getValue() + " times.");
        }
    }
}
and in ChineseWord.txt (UTF-8):
发表
发愁
发达
发抖
发挥
and in RandomData.txt (UTF-8):
The output is
Success
发表 exists 1 times.
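Follow-up: I played a little with the project you shared, and the issue is that your files have a zero-width no-break space (U+FEFF, decimal 65279, the character also used as the byte order mark) at the start of every line (mine did not). Because that invisible character is part of each word read from File1, contains never finds a match.
So you should probably strip that character before doing anything else.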
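As a minimal sketch of what stripping that character could look like: the class and method names below (BomStripper, stripBom, stripAll) are made up for illustration and are not part of the shared project. The sketch assumes the stray character is exactly U+FEFF and that it appears at most once at the start of a line. The sample word is written with Unicode escapes so the snippet compiles regardless of source encoding; it is 发表, the first word in the test file above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from the original project.
final class BomStripper {
    // U+FEFF (decimal 65279): zero-width no-break space / byte order mark.
    private static final char BOM = '\uFEFF';

    // Returns the line without a single leading U+FEFF, or the line unchanged.
    static String stripBom(String line) {
        if (!line.isEmpty() && line.charAt(0) == BOM) {
            return line.substring(1);
        }
        return line;
    }

    // Applies stripBom to every line, e.g. to the result of Files.readAllLines.
    static List<String> stripAll(List<String> lines) {
        List<String> cleaned = new ArrayList<>(lines.size());
        for (String line : lines) {
            cleaned.add(stripBom(line));
        }
        return cleaned;
    }

    public static void main(String[] args) {
        String word = "\u53D1\u8868";           // the sample word
        String raw = "\uFEFF" + word;           // the word as read from a file with the stray character
        String line = "abc " + word + " def";   // a line that really contains the word

        System.out.println(line.contains(raw));           // false: the invisible prefix breaks the match
        System.out.println(line.contains(stripBom(raw))); // true once the prefix is removed
    }
}
```

In the code from the question, this would mean calling stripBom on each word before adding it to list, and on each line read from File2 before calling contains.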
Upvotes: 2
Reputation: 103
I think your problem is in the encoding:
Scanner newScanner = new Scanner(new FileInputStream(file2),"UNICODE");
Try this:
File file1 = new File("data/ChineseWord.txt");
Scanner sc = new Scanner(new FileInputStream(file1),"UNICODE");
ArrayList<String> list = new ArrayList<String>();
ArrayList<String> newList = new ArrayList<String>();
while(sc.hasNext()){
    list.add(sc.next());
}
sc.close();
File file2 = new File("data/RandomData.txt");
Scanner newScanner = new Scanner(new FileInputStream(file2),"UNICODE");
int count = 0;
for (int i = 0; i < list.size(); i++) {
    while(newScanner.hasNext()){
        String word = newScanner.nextLine();
        String toMatch = list.get(i);
        if(word.contains(toMatch)){
            System.out.println("Success");
            count++;
        }
    }
    String test = list.get(i);
    newList.add(test+"exists" + count+ "times");
    count =0;
}
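Whether a charset argument helps depends on how the files were actually saved, which the question does not state. The sketch below only illustrates the general point that the charset passed to Scanner has to match the bytes. It assumes UTF-8 input (the encoding used in the test in the other answer) and reads from an in-memory byte array instead of a file; CharsetDemo and readWords are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class, not part of the original code.
final class CharsetDemo {
    // Reads whitespace-separated tokens, decoding the bytes with the named charset.
    static List<String> readWords(byte[] data, String charsetName) {
        List<String> words = new ArrayList<>();
        try (Scanner sc = new Scanner(new ByteArrayInputStream(data), charsetName)) {
            while (sc.hasNext()) {
                words.add(sc.next());
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Two sample words, one per line, encoded as UTF-8 (assumed file encoding).
        String content = "\u53D1\u8868\n\u53D1\u6101\n";
        byte[] utf8Bytes = content.getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset recovers both words.
        System.out.println(readWords(utf8Bytes, "UTF-8").contains("\u53D1\u8868"));  // true

        // Decoding the same bytes as UTF-16 yields different characters, so nothing matches.
        System.out.println(readWords(utf8Bytes, "UTF-16").contains("\u53D1\u8868")); // false
    }
}
```

If the files are UTF-8, passing "UTF-8" to both scanners would be the analogous change in the code above.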
Upvotes: 0
Reputation: 741
Right now you read the entire file while comparing against only the first element of your list. It should be the other way around: read one line from File2, compare it with every element of the list, then move on to the next line.
Change your code to:
while(newScanner.hasNext()){
    String word = newScanner.nextLine();
    for (int i = 0; i < list.size(); i++) {
        String toMatch = list.get(i);
        if(word.contains(toMatch)){
            System.out.println("Success");
            count++;
        }
    }
}
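Note that with the loops in this order, a single count variable accumulates matches for all words together. If a separate count per word is needed, as in the question, one option is a map keyed by word. This is a rough sketch under the assumption that the words and lines are already in memory; WordLineCounter and countLinesContaining are illustrative names. Like the original code, it counts lines that contain a word, not individual occurrences within a line.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original code.
final class WordLineCounter {
    // For each word, counts how many lines contain it at least once.
    // Words that never match are kept with a count of 0.
    static Map<String, Integer> countLinesContaining(List<String> words, List<String> lines) {
        // LinkedHashMap keeps the words in their original order.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            counts.put(word, 0);
        }
        // Outer loop over lines, inner loop over words: each line is read once.
        for (String line : lines) {
            for (Map.Entry<String, Integer> entry : counts.entrySet()) {
                if (line.contains(entry.getKey())) {
                    entry.setValue(entry.getValue() + 1);
                }
            }
        }
        return counts;
    }
}
```

Each entry of the returned map can then be formatted the same way the question builds newList.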
Upvotes: 2