Reputation: 21
I'm working with some files that might be either UTF-8 or ANSI (Cp1252 specifically), and I need to load them, make some edits, and then output the file again with the original encoding. However, I haven't had any luck getting my program to output ANSI at all.
My code for loading the text is a simple Scanner with a charsetName specified:
fileScanner = new Scanner(f, CHARACTER_SET);
My current code for writing the file is the following:
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file), CHARACTER_SET));
writer.write(this.toString());
System.out.println("Writing " + name + " (" + method + ") using " + CHARACTER_SET + " encoding");
writer.close();
CHARACTER_SET is a String that is either "UTF8" or "windows-1252", depending on which encoding I detected the file to be when loading it.
The file actually outputs just fine in either mode, with all the special accented characters I've encountered coming through uncorrupted. The problem is that if I work on a Cp1252 file, it gets output as UTF-8 even though I initialized the BufferedWriter with a Cp1252 OutputStreamWriter. I can verify this, since the encoding was set via CHARACTER_SET and I print out CHARACTER_SET right afterwards, which shows that Cp1252 was used for ANSI files. I'm checking the encoding of the output by loading it in Notepad++ and seeing what it reports in the bottom right.
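A way to check the written bytes directly, rather than relying on what an editor reports, is sketched below. It is only an illustration: the class name EncodingCheck, the helper encode, and the sample string are made up for this example and are not part of my program. It pushes a string through an OutputStreamWriter into memory, the same way the writer above is set up, so the byte count can be compared per charset. Keep in mind that text containing only ASCII characters is byte-identical in UTF-8 and windows-1252, so for such a file an editor can only guess which label to show.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

class EncodingCheck {
    // Hypothetical helper: encodes text through an OutputStreamWriter
    // into memory so the raw bytes can be inspected.
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, Charset.forName(charsetName))) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String sample = "caf\u00e9"; // "café", one non-ASCII character
        byte[] cp1252 = encode(sample, "windows-1252");
        byte[] utf8 = encode(sample, "UTF-8");
        // The accented character takes one byte in windows-1252 and two in UTF-8.
        System.out.println("windows-1252 byte count: " + cp1252.length);
        System.out.println("UTF-8 byte count: " + utf8.length);
    }
}
```

If the two byte counts differ for a string with accented characters, the OutputStreamWriter is honoring the charset it was given, and any mismatch is more likely to come from how the result is being inspected.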
I know it seems like I'm splitting hairs a little, but I really do want to leave the file with its original encoding.
Upvotes: 0
Views: 256
Reputation: 21
Well, I'm not 100% sure how this works, but I changed my write statement to the following:
writer.write(new String(this.toString().getBytes(Charset.forName(CHARACTER_SET))));
and now it works.
I think what's happening is that the file contents were being loaded correctly, but then re-encoded by Java's internal String format. In order to have it write the file in the format I wanted, I had to convert the text from Java's format into Cp1252 before writing it, even though I initially loaded it as Cp1252.
In conclusion, it seems that the issue was not with loading the text, or setting up the BufferedWriter, but rather it was with the text I was telling the BufferedWriter to write.
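One caveat worth noting: new String(byte[]) without a charset argument decodes with the platform default charset, so the result of this change can differ from machine to machine. The sketch below makes the decoding charset explicit to show both cases. The class name RoundTripCheck and the helper roundTrip are hypothetical names used only for illustration, and this is not a claim about what happened on my machine.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTripCheck {
    // Hypothetical helper: same shape as new String(text.getBytes(charset)),
    // but with the decoding charset passed explicitly instead of using the default.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        String sample = "caf\u00e9"; // "café"

        // Decoding with the same charset gives back the original text.
        System.out.println(sample.equals(roundTrip(sample, cp1252, cp1252)));

        // Decoding Cp1252 bytes as UTF-8 does not: the accented character is not valid UTF-8.
        System.out.println(sample.equals(roundTrip(sample, cp1252, StandardCharsets.UTF_8)));

        System.out.println("Platform default charset: " + Charset.defaultCharset());
    }
}
```

When the default charset matches CHARACTER_SET, the round trip leaves the string unchanged. When it does not, accented characters can be replaced during decoding, so it is worth comparing the output against the original text and not just the encoding label an editor shows.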
Upvotes: 0