Reputation: 41
I have a Java class that uploads a text file from a Windows client to a Linux server.
The file I am trying to upload is encoded in Cp1252 or ISO-8859-1.
Once the file is uploaded, it appears to be encoded in UTF-8, and strings containing accented characters such as éèà can no longer be read.
The command
file -i *
on the Linux server tells me that the file is encoded in UTF-8.
I think the encoding was changed during the upload, so I added this code to my servlet:
String currentEncoding=System.getProperty("file.encoding");
System.setProperty("file.encoding", "Cp1252");
item.write(file);
System.setProperty("file.encoding", currentEncoding);
In the JSP file, I have this code:
<form name="formUpload"
action="..." method="post"
enctype="multipart/form-data" accept-charset="ISO-8859-1">
The library I use to upload the file is Apache Commons FileUpload.
Does anyone have a clue? I'm really running out of ideas!
Thanks,
Otmane MALIH
Upvotes: 2
Views: 2349
Reputation: 328760
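Setting the system property file.encoding only works when you start Java (for example with -Dfile.encoding=Cp1252 on the command line); changing it at runtime has no effect on writers that rely on the default charset. Instead, you will have to open the file with an explicit charset, using code like this: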
// Opens the file for writing with an explicit charset instead of the platform default.
public static BufferedWriter createWriter( File file, Charset charset ) throws IOException {
    FileOutputStream stream = new FileOutputStream( file );
    return new BufferedWriter( new OutputStreamWriter( stream, charset ) );
}
Use Charset.forName("iso8859-1") as the charset parameter.
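As a rough illustration of why the explicit charset matters, the sketch below reuses the createWriter helper and compares the bytes that end up on disk for the same string. The class name ExplicitCharsetDemo, the method writeAndReadBack and the temporary file are made up for this example and are not part of the original servlet.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class ExplicitCharsetDemo {

    // Same helper as in the answer: the charset is passed in explicitly,
    // so the file.encoding system property is never consulted.
    static BufferedWriter createWriter(File file, Charset charset) throws IOException {
        FileOutputStream stream = new FileOutputStream(file);
        return new BufferedWriter(new OutputStreamWriter(stream, charset));
    }

    // Hypothetical helper: writes the text to a temporary file in the given
    // charset and returns the raw bytes that were stored on disk.
    static byte[] writeAndReadBack(String text, Charset charset) throws IOException {
        File file = File.createTempFile("charset-demo", ".txt");
        try {
            try (BufferedWriter writer = createWriter(file, charset)) {
                writer.write(text);
            }
            return Files.readAllBytes(file.toPath());
        } finally {
            file.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "\u00e9\u00e8\u00e0"; // the characters é, è, à

        // One byte per character in ISO-8859-1.
        System.out.println(writeAndReadBack(text, StandardCharsets.ISO_8859_1).length); // prints 3

        // Two bytes per character in UTF-8.
        System.out.println(writeAndReadBack(text, StandardCharsets.UTF_8).length); // prints 6
    }
}
```

The same three characters occupy three bytes in ISO-8859-1 and six in UTF-8, so comparing file sizes on the client and on the server is a quick first hint as to whether the content was re-encoded on the way.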
[EDIT] Your problem is most likely the file command. MacOS is the only OS in the world which can tell you the encoding of a file with confidence. Windows and Linux have to make a guess. This guess can be wrong.
So what you need to do is to open the file with an editor where you specify the encoding. You need to do that on Windows (to make sure that the file really was saved with Cp1252; some applications ignore the platform and always save their data in UTF-8).
And you need to do the same on Linux. If you just open the file, the editor will take the platform encoding (which is UTF-8 on modern Linux systems) and try to read the file with that, so ISO-8859-1 umlauts will be garbled. But if you open the file with ISO-8859-1, then UTF-8 will be garbled. That's the only way to be sure what the encoding of a text file really is.
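The same check can be approximated in code. The following is only a sketch, and the names EncodingCheck and decodesCleanly are invented for this example: it decodes the raw bytes strictly and reports whether the charset rejects them. Note the asymmetry: a strict UTF-8 decode fails on typical ISO-8859-1 accents, but ISO-8859-1 accepts every byte sequence, so for that direction you still have to look at the decoded text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class EncodingCheck {

    // Returns true if the bytes can be decoded with the given charset
    // without hitting malformed or unmappable input.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "\u00e9\u00e8\u00e0"; // the characters é, è, à
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);

        // ISO-8859-1 accents are not valid UTF-8 sequences, so this is rejected.
        System.out.println(decodesCleanly(latin1Bytes, StandardCharsets.UTF_8)); // prints false

        // Genuine UTF-8 decodes without errors.
        System.out.println(decodesCleanly(utf8Bytes, StandardCharsets.UTF_8)); // prints true

        // ISO-8859-1 never rejects anything: the decode succeeds,
        // but each accented character comes out as two garbled characters.
        System.out.println(decodesCleanly(utf8Bytes, StandardCharsets.ISO_8859_1)); // prints true
        System.out.println(new String(utf8Bytes, StandardCharsets.ISO_8859_1).length()); // prints 6
    }
}
```

To apply this to the uploaded file, the bytes could be read with Files.readAllBytes and passed to decodesCleanly; if the strict UTF-8 decode fails, the file on the server was not converted to UTF-8, whatever the file command guesses.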
Upvotes: 2