Reputation: 598
I want to read an XML file from the internet. You can find it here.
The problem is that it is encoded in UTF-8 and I need to store it in a file in order to parse it later. I have already read a lot of topics about this, and here is what I came up with:
BufferedReader in;
String readLine;
try
{
    in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));
    BufferedWriter out = new BufferedWriter(new FileWriter(file));
    while ((readLine = in.readLine()) != null)
        out.write(readLine+"\n");
    out.close();
}
catch (UnsupportedEncodingException e)
{
    e.printStackTrace();
}
catch (IOException e)
{
    e.printStackTrace();
}
This code works until it reaches this line: <title>Chérie FM</title>
When I debug, I get this: <title>Ch�rie FM</title>
Obviously, there is something I fail to understand, but it seems to me that I followed the code I saw on several websites.
Upvotes: 2
Views: 3421
Reputation: 40333
This file is not encoded as UTF-8; it's ISO-8859-1.
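To see why the wrong charset produces the replacement character, here is a minimal sketch. It assumes the title bytes arrive as ISO-8859-1, where "é" is the single byte 0xE9; the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    /** Encodes "Chérie FM" as ISO-8859-1, then decodes those bytes with the given charset. */
    public static String decodeLatin1BytesAs(Charset charset) {
        // In ISO-8859-1, 'é' (U+00E9) is the single byte 0xE9.
        byte[] latin1Bytes = "Ch\u00e9rie FM".getBytes(StandardCharsets.ISO_8859_1);
        // Decoding with the wrong charset replaces invalid sequences with U+FFFD.
        return new String(latin1Bytes, charset);
    }

    public static void main(String[] args) {
        // 0xE9 followed by 'r' is not a valid UTF-8 sequence, so the 'é' is lost.
        System.out.println(decodeLatin1BytesAs(StandardCharsets.UTF_8));
        // Decoding with the charset the bytes were written in restores the text.
        System.out.println(decodeLatin1BytesAs(StandardCharsets.ISO_8859_1));
    }
}
```

Once the decoder has substituted U+FFFD, the original byte is gone, so the fix has to happen at the point where the stream is read.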
Change your code to read the input as ISO-8859-1 and write the output as UTF-8:
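BufferedReader in;
String readLine;
try
{
    // Decode the download with the charset the file is actually stored in.
    in = new BufferedReader(new InputStreamReader(url.openStream(), "ISO-8859-1"));
    // Name the output charset explicitly; FileWriter would use the platform default.
    BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file), "UTF-8"));
    while ((readLine = in.readLine()) != null)
        out.write(readLine+"\n");
    // close() flushes the buffer, so a separate flush() is not required.
    out.close();
    in.close();
}
catch (UnsupportedEncodingException e)
{
    e.printStackTrace();
}
catch (IOException e)
{
    e.printStackTrace();
}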
You should have the expected result.
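For reference, the same idea can be sketched as a small self-contained helper. This is only one way to do it, assuming Java 7 or later; the `Transcoder` class and `transcode` method are names I made up, and the demo uses in-memory streams instead of the real URL and file.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Transcoder {
    /** Reads text from in decoded as `from` and writes it to out encoded as `to`. */
    public static void transcode(InputStream in, Charset from,
                                 OutputStream out, Charset to) throws IOException {
        // try-with-resources closes (and flushes) both streams, even on failure.
        try (Reader reader = new BufferedReader(new InputStreamReader(in, from));
             Writer writer = new BufferedWriter(new OutputStreamWriter(out, to))) {
            char[] buffer = new char[8192];
            int n;
            // Copy characters in chunks so line endings are preserved as-is.
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the downloaded bytes: the title line encoded as ISO-8859-1.
        byte[] latin1 = "<title>Ch\u00e9rie FM</title>\n".getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream utf8 = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(latin1), StandardCharsets.ISO_8859_1,
                  utf8, StandardCharsets.UTF_8);
        System.out.print(new String(utf8.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

With the real data you would pass `url.openStream()` and `new FileOutputStream(file)` instead of the in-memory streams. Keep in mind that if the XML declaration inside the document names ISO-8859-1, the saved UTF-8 file will no longer match it, which may matter to the parser you use later.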
Upvotes: 8
Reputation: 9941
If you need to write a file in a given encoding, use FileOutputStream instead.
in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));
FileOutputStream out = new FileOutputStream(file);
while ((readLine = in.readLine()) != null)
    out.write((readLine+"\n").getBytes("UTF-8"));
out.close();
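A quick way to check what this approach puts on disk is to write one line and read the raw bytes back. The sketch below assumes a throwaway temporary file; the class name, method name, and file prefix are hypothetical.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8FileWrite {
    /** Writes one line to a temporary file as UTF-8 bytes and returns the bytes found on disk. */
    public static byte[] writeLineAsUtf8(String line) {
        try {
            Path file = Files.createTempFile("utf8-demo", ".xml");
            try (FileOutputStream out = new FileOutputStream(file.toFile())) {
                // getBytes with an explicit charset does not depend on the platform default.
                out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            }
            byte[] written = Files.readAllBytes(file);
            Files.delete(file);
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = writeLineAsUtf8("Ch\u00e9rie FM");
        // Print the bytes in hex; in UTF-8 the 'é' takes two bytes, C3 A9.
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim());
    }
}
```

This only controls the output side. The string still has to be decoded correctly when it is read, otherwise the replacement character is what gets encoded.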
Upvotes: -1