Reputation: 6908
I'm currently trying to write German umlauts, read from the console, into a UTF-8 encoded text file on Windows 7.
Here is the code to set up the scanner:
Scanner scanner = new Scanner(System.in, "UTF8");
Here is the code to read the string:
String s = scanner.nextLine();
Here is the code to write into a file:
OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream(this.targetFile), "UTF8");
osw.write(s);
Unfortunately, for input such as "überraschung", the resulting file is encoded in UTF-8 but does not display the umlaut. What can I do?
Upvotes: 2
Views: 18613
Reputation: 4398
This worked for me with German umlauts:
import java.io.BufferedReader;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

public class P {
    public static void main(String[] args) throws Exception {
        // No charset given here, so System.in is decoded with the platform default.
        BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
        String s = stdin.readLine();
        // The file is written explicitly as UTF-8.
        OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream("D:/p.txt"), "UTF-8");
        osw.write(s);
        osw.close();
    }
}
Upvotes: 0
Reputation: 2236
I had a similar problem: the string "ç" was not detected by the Scanner, and strings like "Açores" had the ç character garbled.
I solved it by passing the charset that matched my input to the Scanner:
Scanner keyboardReader = new Scanner(System.in, "iso-8859-1");
Upvotes: 2
Reputation: 21995
Your console is probably not using UTF-8, so when you call new Scanner(System.in, "UTF8"); you are creating a scanner with the wrong encoding, and your umlauts are lost when you read lines from the console.
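To make the effect concrete, here is a small sketch of what happens when bytes are decoded with the wrong charset. It assumes, purely for illustration, that the console delivers ISO-8859-1 bytes; your actual code page may be different (850 and 1252 are common on Windows). The class and method names are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WrongCharsetDemo {
    // Decode raw bytes with the given charset, which is what a Scanner or
    // InputStreamReader does internally with the bytes it reads from System.in.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Assumption: the console sends ISO-8859-1, where 'ü' is the single byte 0xFC.
        // The \u00fc escape keeps this source file independent of its own encoding.
        byte[] typed = "\u00fcber".getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with the charset that produced the bytes restores the text.
        System.out.println("matching charset: " + decode(typed, StandardCharsets.ISO_8859_1));

        // 0xFC is not a valid UTF-8 sequence, so the umlaut cannot be recovered
        // and a replacement character is substituted for it.
        System.out.println("wrong charset:    " + decode(typed, StandardCharsets.UTF_8));
    }
}
```

Once the umlaut has been replaced during decoding, writing the string out as UTF-8 cannot bring it back, which would explain a file that is valid UTF-8 but has no umlaut in it.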
You may want to run chcp at a console prompt to check which code page is in use.
In fact, you might not need to specify an encoding at all. If you just create the scanner as new Scanner(System.in), the default platform encoding should be used.
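Putting it together, a minimal sketch could decode the console input with whatever charset the console really uses and encode the file as UTF-8. The class name, the copyLine helper, the out.txt file name, and the idea of passing the charset name as a command-line argument are all assumptions made for this example, not part of the original code. If no argument is given it falls back to the platform default, which may or may not match the code page reported by chcp.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ConsoleLineToUtf8 {
    // Reads one line from 'in', decoding it with the console's charset,
    // then writes that line to 'target' encoded as UTF-8.
    static String copyLine(InputStream in, Charset consoleCharset, Path target) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, consoleCharset));
        String line = reader.readLine();
        if (line == null) {
            line = ""; // end of input before any line was read
        }
        // try-with-resources closes (and therefore flushes) the writer.
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write(line);
        }
        return line;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java ConsoleLineToUtf8 IBM850
        Charset consoleCharset = args.length > 0
                ? Charset.forName(args[0])
                : Charset.defaultCharset();
        copyLine(System.in, consoleCharset, Paths.get("out.txt"));
    }
}
```

The input and output charsets are deliberately independent here: the first only has to match the console, the second only has to match what you want in the file.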
Upvotes: 3