praks5432

Reputation: 7792

Java Scanner Changing String

I'm doing some basic file reading from a text file using Scanner.

The first few entries are:

0 MR2Spyder
1 Tundra
3 Echo
3 Yaris
4 ScionxB
4 ScionxD

I instantiate the scanner normally and then do this -

String line = scanner.nextLine();
System.out.println(line);

I then get this output -

ÿþ0 M R 2 S p y d e r 

This doesn't make sense to me. Is there some problem with the Scanner class? Should I be using BufferedReader instead?

Upvotes: 1

Views: 1958

Answers (1)

obataku

Reputation: 29646

Your file is encoded in UTF-16. The gaps between the characters and the leading ÿþ both indicate that: ÿþ is the byte order mark, and each gap is the zero byte that makes up the other half of a 16-bit unit for an ASCII character. See here:

if the 16-bit units use little-endian order, the sequence of bytes will have 0xFF followed by 0xFE. This sequence appears as the ISO-8859-1 characters ÿþ in a text display that expects the text to be ISO-8859-1.

You must specify that encoding when constructing your Scanner:

final Scanner scanner = new Scanner(file, "UTF-16");
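
If you want to reproduce the effect yourself, here is a minimal sketch. It is not your actual code: the class name Utf16ScannerDemo, the helpers writeSample and firstLine, and the temp file are all made up for illustration. It writes a small little-endian UTF-16 file with a byte order mark, then reads the first line back twice, once with the right charset and once with ISO-8859-1 to stand in for a mismatched default.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

// Hypothetical demo class; the names here are illustrative only.
class Utf16ScannerDemo {

    // Writes a temp file as UTF-16LE. The leading U+FEFF character
    // encodes to the bytes 0xFF 0xFE, i.e. the little-endian BOM.
    static Path writeSample() throws IOException {
        Path file = Files.createTempFile("cars", ".txt");
        String content = "\uFEFF0 MR2Spyder\n1 Tundra\n";
        Files.write(file, content.getBytes(StandardCharsets.UTF_16LE));
        return file;
    }

    // Reads the first line of the file using the given charset name.
    static String firstLine(Path file, String charsetName) throws IOException {
        try (Scanner scanner = new Scanner(file, charsetName)) {
            return scanner.nextLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = writeSample();

        // "UTF-16" uses the BOM to pick the byte order and strips it.
        System.out.println(firstLine(file, "UTF-16")); // prints "0 MR2Spyder"

        // Decoding the same bytes as ISO-8859-1 turns the BOM into
        // "\u00FF\u00FE" and leaves a NUL character after every letter.
        String wrong = firstLine(file, "ISO-8859-1");
        System.out.println(wrong.startsWith("\u00FF\u00FE"));

        Files.delete(file);
    }
}
```

The NUL characters in the wrongly decoded line are what your console shows as gaps between the letters. Scanner itself is fine, and BufferedReader would give the same result if it wrapped a reader using the wrong charset.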

Upvotes: 6
