jabal

Reputation: 12367

servlet file upload filename encoding

I am using Apache Commons FileUpload for standard file uploads. My problem is that I cannot get the proper filename of an uploaded file if it contains special characters (á, é, ú, etc.). They are all converted to question marks (?).

request.getCharacterEncoding() returns UTF-8, but in the string returned by fileItem.getName() every special character is represented by the same byte.

Can you help me figure out what's wrong?

(Some details: using Firefox 3.6.12, Weblogic 10.3 on Windows)

This is my code snippet:

 public CommandMsg(HttpServletRequest request) {
    Enumeration names = null;
    if (isMultipart(request)) {
      FileItemFactory factory = new DiskFileItemFactory();
      ServletFileUpload upload = new ServletFileUpload(factory);
      try {
        List uploadedItems = upload.parseRequest(request);
        Iterator i = uploadedItems.iterator();
        FileItem fileItem = null;
        while (i.hasNext()) {
          fileItem = (FileItem) i.next();
          if (fileItem.isFormField()) {
            // System.out.println("isFormField");
            setAttribute(fileItem.getFieldName(), fileItem.getString());
          } else {
            String enc = "utf-8";
            enc = request.getCharacterEncoding();
            String fileName = fileItem.getName();
            byte[] fnb = fileItem.getName().getBytes();
            byte[] fnb2 = null;
            try {
                fnb2 = fileItem.getName().getBytes(enc);
                String t1 = new String(fnb);
                String t2 = new String(fnb2);
                String t3 = new String(fnb, enc);
                String t4 = new String(fnb2, enc);
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }
            setAttribute(fileItem.getFieldName(), fileItem);
          }
        }
      } catch (FileUploadException ex) {
        ex.printStackTrace();
      }

// etc..

Upvotes: 19

Views: 27860

Answers (4)

Jeevi

Reputation: 3042

For these special characters, you can set the encoding to "ISO-8859-1". UTF-8 does not seem to work here.

If you do not set an encoding explicitly, the platform default applies: usually UTF-8 on a Linux machine, and a locale-dependent encoding on Windows.
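To make the point about platform defaults concrete, here is a minimal sketch using only the JDK. The class and method names are made up for illustration. It shows that the same character encodes to a different number of bytes depending on the charset, which is why relying on the default can behave differently between machines.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Commons FileUpload.
class ExplicitCharsetSketch {

    // Returns how many bytes the string occupies in the given charset.
    public static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String special = "\u00e1"; // the character 'á'

        // Whatever the JVM picked up from the platform; varies per machine.
        System.out.println("Default charset: " + Charset.defaultCharset());

        System.out.println(encodedLength(special, StandardCharsets.UTF_8));      // 2
        System.out.println(encodedLength(special, StandardCharsets.ISO_8859_1)); // 1
    }
}
```

Passing the charset explicitly, as above, removes the dependence on the machine's default.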

Upvotes: 0

Christoph

Reputation: 4000

I had the same problem and solved it like this.

ServletFileUpload upload = new ServletFileUpload(factory);
// Decode the part headers (which carry the uploaded file names) as UTF-8.
upload.setHeaderEncoding("UTF-8");

FileItemIterator iter = upload.getItemIterator(request);
while (iter.hasNext()) {
    FileItemStream item = iter.next();
    String name = item.getFieldName();
    InputStream stream = item.openStream();
    if (item.isFormField()) {
        // Decode the value of a regular form field as UTF-8 as well.
        String value = Streams.asString(stream, "UTF-8");
    }
}

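If you based your code on the example provided in http://commons.apache.org/fileupload/streaming.html, make sure you set UTF-8 in both places shown above: setHeaderEncoding (for the part headers, which include the file name) and Streams.asString (for form field values).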
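Why the header encoding matters can be illustrated without the library. The sketch below is JDK-only and assumes the browser sent the part header as UTF-8 bytes; the header text and the class and method names are hypothetical. It does not reproduce what Commons FileUpload does internally. It only shows that the same bytes yield the correct file name with the right charset and a damaged one otherwise.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of Commons FileUpload.
class HeaderDecodingSketch {

    // Turns the raw bytes of a part header into text using the given charset,
    // which is the step a multipart parser must perform to obtain a file name.
    public static String decodeHeader(byte[] rawHeader, Charset charset) {
        return new String(rawHeader, charset);
    }

    public static void main(String[] args) {
        // Made-up header for a file named "árvíz.txt" (escapes keep the source ASCII-only).
        String sent = "Content-Disposition: form-data; name=\"file\"; "
                + "filename=\"\u00e1rv\u00edz.txt\"";
        byte[] onTheWire = sent.getBytes(StandardCharsets.UTF_8);

        String right = decodeHeader(onTheWire, StandardCharsets.UTF_8);
        String wrong = decodeHeader(onTheWire, StandardCharsets.US_ASCII);

        System.out.println(right.equals(sent)); // true
        System.out.println(wrong.equals(sent)); // false
    }
}
```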

Upvotes: 18

jabal

Reputation: 12367

Solved the problem by explicitly calling .setHeaderEncoding("ISO-8859-2") on the ServletFileUpload instance.

Upvotes: 2

BalusC

Reputation: 1109362

You need to ensure that the target console/file/database/whatever where you're printing/writing/inserting the file name to supports UTF-8 as well. The question marks indicate that it isn't configured to accept UTF-8 and that the target itself is aware of that. Otherwise you would just have seen mojibake.
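The difference between question marks and mojibake can be reproduced with the JDK alone. This is a minimal sketch with made-up class and method names. US-ASCII stands in for "a target that cannot represent the character", and ISO-8859-1 for "a target that silently misreads UTF-8 bytes".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class.
class QuestionMarkVsMojibake {

    // Encoding into a charset that lacks the character: the encoder knows it
    // cannot represent it and substitutes '?' for each such character.
    public static String throughAscii(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Decoding UTF-8 bytes as ISO-8859-1: every byte maps to some character,
    // so nothing is flagged and the result is mojibake.
    public static String utf8ReadAsLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String name = "\u00e1\u00e9\u00fa.txt"; // "áéú.txt"
        System.out.println(throughAscii(name));     // ???.txt
        System.out.println(utf8ReadAsLatin1(name)); // garbled, but no '?'
    }
}
```

In the first case every special character collapses to the same byte (0x3F), which matches the symptom of identical bytes for all special characters.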

Since the question does not say what the target is, I can't do much more than suggest reading this article to understand what happens to characters behind the scenes.

Upvotes: 2
