Reputation: 7792
I'm doing some basic file reading from a text file using Scanner.
The first few entries are:
0 MR2Spyder
1 Tundra
3 Echo
3 Yaris
4 ScionxB
4 ScionxD
I instantiate the Scanner normally and then do this:
String line = scanner.nextLine();
System.out.println(line);
I then get this output:
ÿþ0 M R 2 S p y d e r
That doesn't make sense to me. Is there some problem with the Scanner class? Should I be using BufferedReader?
Upvotes: 1
Views: 1958
Reputation: 29646
Your file is encoded using UTF-16. The spaces between characters and the leading ÿþ are indicative of that: ÿþ is the byte order mark. See here:
if the 16-bit units use little-endian order, the sequence of bytes will have 0xFF followed by 0xFE. This sequence appears as the ISO-8859-1 characters ÿþ in a text display that expects the text to be ISO-8859-1.
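To see the effect for yourself, here is a minimal sketch that builds the bytes of a little-endian UTF-16 string with a byte order mark and decodes them twice. The class name BomMisdecodeDemo and the helper utf16LeWithBom are made up for illustration, and the sample text is taken from the first line of the question's file.

```java
import java.nio.charset.StandardCharsets;

public class BomMisdecodeDemo {

    // Hypothetical helper: encodes text as UTF-16LE and prepends the FF FE byte order mark.
    static byte[] utf16LeWithBom(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_16LE); // this encoder writes no BOM
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFF;
        out[1] = (byte) 0xFE;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = utf16LeWithBom("0 MR2Spyder");

        // Wrong charset: every byte becomes one character, so the BOM shows up
        // as two visible characters and each zero high byte becomes a NUL.
        String wrong = new String(bytes, StandardCharsets.ISO_8859_1);

        // Right charset: the UTF-16 decoder reads the BOM to pick the byte order
        // and does not include it in the decoded text.
        String right = new String(bytes, StandardCharsets.UTF_16);

        // NULs are replaced with spaces only so the result is visible on a console.
        System.out.println(wrong.replace('\0', ' '));
        System.out.println(right);
    }
}
```

The NUL characters between the letters are what a console typically renders as the gaps in your output.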
You must specify that when constructing your Scanner.
final Scanner scanner = new Scanner(file, "UTF-16");
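If you want to try this end to end, the following is a rough, self-contained sketch. It writes a small temporary file in little-endian UTF-16 with a byte order mark (an assumption about how your file was saved, based on the ÿþ prefix) and reads the first line back. The class name Utf16ScannerDemo and the helpers writeSample and firstLine are hypothetical names used only for this example.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

public class Utf16ScannerDemo {

    // Hypothetical helper: writes a two-line sample file as UTF-16LE with an FF FE BOM.
    static File writeSample() throws IOException {
        byte[] text = "0 MR2Spyder\n1 Tundra\n".getBytes(StandardCharsets.UTF_16LE);
        byte[] all = new byte[text.length + 2];
        all[0] = (byte) 0xFF;
        all[1] = (byte) 0xFE;
        System.arraycopy(text, 0, all, 2, text.length);
        Path path = Files.createTempFile("cars", ".txt");
        Files.write(path, all);
        return path.toFile();
    }

    // Reads the first line of the file, decoding it with the named charset.
    static String firstLine(File file, String charsetName) throws IOException {
        try (Scanner scanner = new Scanner(file, charsetName)) {
            return scanner.nextLine();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = writeSample();
        try {
            // "UTF-16" lets the decoder use the BOM to choose the byte order.
            System.out.println(firstLine(file, "UTF-16"));
        } finally {
            file.delete();
        }
    }
}
```

Passing "UTF-16" rather than "UTF-16LE" is deliberate here: the plain "UTF-16" decoder consumes the byte order mark, so it should not appear at the start of the first line.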
Upvotes: 6