Reputation: 2630
Usually, when I read text files, I do it like this:
File file = new File("some_text_file.txt");
Scanner scanner = new Scanner(new FileInputStream(file));
StringBuilder builder = new StringBuilder();
while (scanner.hasNextLine()) {
    builder.append(scanner.nextLine());
    builder.append('\n');
}
scanner.close();
String text = builder.toString();
There may be better ways, but this method has always worked for me perfectly.
For what I am working on right now, I need to read a large text file (over 700 kilobytes in size). Here is a sample of the text when opened in Notepad (the one that comes standard with any Windows operating system):
"lang"
{
"Language" "English"
"Tokens"
{
"DOTA_WearableType_Daggers" "Daggers"
"DOTA_WearableType_Glaive" "Glaive"
"DOTA_WearableType_Weapon" "Weapon"
"DOTA_WearableType_Armor" "Armor"
However, when I read the file using the method above, the output is not what Notepad shows. (I could not paste the output here for some reason.) I have also tried reading the file like so:
File file = new File("some_text_file.txt");
Path path = file.toPath();
String text = new String(Files.readAllBytes(path));
... with no change in result.
How come the output is not as expected? I also tried reading a text file that I wrote and it worked perfectly fine.
Upvotes: 1
Views: 370
Reputation: 3212
final Scanner scanner = new Scanner(new FileInputStream(file), "UTF-16");
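The same idea should carry over to the Files.readAllBytes attempt from the question: new String(bytes) with no charset argument decodes with the platform default, so passing the charset explicitly is the likely fix there too. Below is a minimal sketch, assuming the file really is UTF-16 (which has not been confirmed); the class name ReadWithCharset and the helper readText are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadWithCharset {
    // Hypothetical helper: decodes the whole file with an explicit charset
    // instead of relying on the platform default.
    static String readText(Path path, Charset charset) throws IOException {
        return new String(Files.readAllBytes(path), charset);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the file is UTF-16. Java's "UTF-16" decoder reads the
        // byte-order mark, if present, to choose the byte order, and falls
        // back to big-endian when there is none.
        Path path = Paths.get("some_text_file.txt");
        String text = readText(path, StandardCharsets.UTF_16);
        System.out.println(text);
    }
}
```

If the text still comes out garbled, the UTF-16 guess is probably wrong and another charset needs to be tried.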
Upvotes: 1
Reputation: 10908
It looks like an encoding problem. Open the file in a tool that can detect encodings (such as Notepad++) to find out how it is encoded. Then use the Scanner constructor that takes a charset name:
Scanner scanner = new Scanner(new FileInputStream(file), encoding);
Alternatively, you can simply experiment by trying different encodings. It looks like UTF-16 to me.
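Instead of guessing, one option is to check the first few bytes of the file for a byte-order mark (BOM), which Windows editors often write at the start of UTF-16 and UTF-8 files. The following is only a rough sketch, not a general encoding detector: the class BomSniffer and its methods are hypothetical names, it recognises just three BOMs, and it falls back to a caller-supplied charset when none is found.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class BomSniffer {
    // True if the byte array begins with the given unsigned byte values.
    static boolean startsWith(byte[] b, int... prefix) {
        if (b.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((b[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    // Decodes the bytes using the charset implied by a leading BOM and skips
    // the BOM itself; uses the fallback charset when no known BOM is present.
    // Limitation: a UTF-32LE BOM (FF FE 00 00) would be mistaken for UTF-16LE.
    static String decode(byte[] b, Charset fallback) {
        if (startsWith(b, 0xEF, 0xBB, 0xBF)) {
            return new String(b, 3, b.length - 3, StandardCharsets.UTF_8);
        }
        if (startsWith(b, 0xFE, 0xFF)) {
            return new String(b, 2, b.length - 2, StandardCharsets.UTF_16BE);
        }
        if (startsWith(b, 0xFF, 0xFE)) {
            return new String(b, 2, b.length - 2, StandardCharsets.UTF_16LE);
        }
        return new String(b, fallback);
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get("some_text_file.txt"));
        // Assumption: UTF-8 is a reasonable fallback when there is no BOM.
        System.out.println(decode(bytes, StandardCharsets.UTF_8));
    }
}
```

A file without a BOM cannot be identified this way, so the advice above about checking in an editor or trying encodings still applies.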
Upvotes: 2