masm64

Reputation: 1272

Uploading files whose filenames contain Unicode characters doesn't work

I'm trying to upload files from an HTML form. I wrote a servlet for it, which opens an InputStream on each received part and writes the data to a file with the same name and extension.

At first I had problems with the data itself: text files with a Unicode body were not written out correctly as UTF-8. Then I started using DataInputStream and DataOutputStream, and for some reason that now works correctly.

What remains is the problem with the filename. Whenever the filename contains Unicode characters, it ends up with the wrong encoding and odd characters appear, as you would expect from a wrong encoding. I've tried several things, but I don't know how to fix it. I'm using Wildfly 10.0.10.Final. For example, if my file is named ááéé.txt, the resulting file name is ááéé.txt.

This is my HTML page:

<html>
<h:head>        
    <meta charset="UTF-8" />
    <meta content="text/html" />
</h:head>
<h:body>
    <div class="container">
        Upload a new file:
        <form enctype="multipart/form-data" method="post" action="upload">
        Files: <input multiple="multiple" id="fileUpload" type="file" name="files" />
        <input type="submit" multiple="multiple" value="upload" />
    </form>
    </div>   
</h:body>
</html>

My servlet is written as below:

@WebServlet(name = "fileUploadServlet", urlPatterns = {"/upload"})
@MultipartConfig
public class FileUploadServlet extends HttpServlet {    
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        req.setCharacterEncoding("UTF-8");
        int n = 0;
        for (Part file : req.getParts()) {
            String fileName = new String(file.getSubmittedFileName().getBytes("UTF-8"), "UTF-8");            
            try (DataInputStream dis = new DataInputStream(file.getInputStream());
                 DataOutputStream dos = new DataOutputStream(new FileOutputStream("E:\\upload\\" + fileName))) {
                byte[] buffer = new byte[1024];
                int r;
                while ((r = dis.read(buffer)) != -1) {
                    dos.write(buffer, 0, r);
                }
                n++;
            }
        }
        resp.getWriter().print(n + " files uploaded.");
    }
}

Thanks in advance!

Upvotes: 2

Views: 3022

Answers (2)

auntyellow

Reputation: 2573

req.setCharacterEncoding(...) does not always take effect.

If you are using Tomcat, set URIEncoding="UTF-8" in the Connector section of your server.xml, e.g.:

<Connector port="80" protocol="HTTP/1.1" maxThreads="150" connectionTimeout="20000" enableLookups="false"
    URIEncoding="UTF-8" redirectPort="443" />

There may be a similar setting in WildFly, I guess.

Upvotes: 0

maximwirt

Reputation: 36

It seems the WildFly implementation doesn't use the request's character encoding. I found a solution:

String filename = new String(part.getSubmittedFileName().getBytes("ISO-8859-1"), "UTF-8");
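To see why that line helps, here is a minimal standalone sketch of the round trip. It assumes the container decoded the UTF-8 bytes of the filename as ISO-8859-1, which matches the ááéé.txt symptom in the question. The class name FilenameRepair and the method repair are hypothetical names used only for this illustration; they are not part of the servlet API.

```java
import java.nio.charset.StandardCharsets;

public class FilenameRepair {

    // Hypothetical helper: undoes a wrong ISO-8859-1 decoding of UTF-8 bytes.
    // ISO-8859-1 maps every byte to exactly one char, so getBytes() recovers
    // the original bytes unchanged, and they can then be decoded as UTF-8.
    public static String repair(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "ááéé.txt", written with escapes so the source file encoding does not matter.
        String original = "\u00e1\u00e1\u00e9\u00e9.txt";

        // Simulate the assumed container behaviour: UTF-8 bytes read as ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        System.out.println(repair(misdecoded).equals(original)); // prints "true"
    }
}
```

Only apply this when the name really was mis-decoded. A filename that already arrived as correct Unicode would be damaged by the same conversion, because its ISO-8859-1 bytes are not valid UTF-8. Plain ASCII names pass through unchanged. It also shows why the line in the question, which converts to UTF-8 bytes and straight back from UTF-8, changes nothing.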

Upvotes: 2
