Dog

Reputation: 2906

How to read contents of a multipart file inputstream in Java

I have a Thymeleaf HTML form that accepts an uploaded file as input and then makes a POST request to a Java controller with the multipart file. I then convert the file into an InputStream. While I am able to read the file's size and input type, I am not able to print out its contents.

For example, for a .doc file, if I try methods I have found to print out the file's contents, it merely prints a series of numbers. Which I'm assuming is an encoding. Does a method exist to print out the contents of an uploaded .doc file?

The controller action I'm currently using to attempt to print out the file's contents is:

@PostMapping("/file-upload")
public String uploadFile(@RequestParam("fileUpload") MultipartFile fileUpload, Model model) throws IOException {
    InputStream fis = fileUpload.getInputStream();

    for (int i = 0; i < fis.available(); i++) {
        System.out.println("" + fis.read());
    }

    return "home";
}

And the form I am using to submit the file is:

<form th:action="@{/file-upload}" enctype="multipart/form-data" method="POST">
    <div class="container">
        <div class="row" style="margin: 1em;">
            <div class="col-sm-2">
                <label for="fileUpload">Upload a New File:</label>
            </div>
            <div class="col-sm-6">
                <input type="file" class="form-control-file" id="fileUpload" name="fileUpload">
            </div>
            <div class="col-sm-4">
                <button type="submit" class="btn btn-dark">Upload</button>
            </div>
        </div>
    </div>
</form>

Upvotes: 0

Views: 8838

Answers (1)

VGR

Reputation: 44404

Do not use InputStream.available(). From the documentation:

"It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream."

Only a return value of -1 from read() indicates the end of the InputStream.
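To see why an available()-based loop falls short, here is a minimal sketch that compares it with a loop that stops only at -1. The class and method names (AvailableVersusEof, countWithAvailable, countUntilEof) are made up for illustration, and a ByteArrayInputStream stands in for the uploaded file's stream. With this stream type, available() shrinks on every read while the counter grows, so the first loop stops about halfway through.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class AvailableVersusEof {
    // Hypothetical helper mirroring the question's loop: compares a growing
    // counter against available(), which is re-evaluated on every iteration.
    static int countWithAvailable(InputStream in) throws IOException {
        int count = 0;
        for (int i = 0; i < in.available(); i++) {
            in.read();
            count++;
        }
        return count;
    }

    // Hypothetical helper: keeps reading until read() reports end of stream (-1).
    static int countUntilEof(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10]; // stand-in for a 10-byte upload
        System.out.println(countWithAvailable(new ByteArrayInputStream(data))); // prints 5
        System.out.println(countUntilEof(new ByteArrayInputStream(data)));      // prints 10
    }
}
```

Other stream implementations may report different available() values, which is one more reason not to build a loop on it.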

"For example, for a .doc file, if I try methods I have found to print out the file's contents, it merely prints a series of numbers. Which I'm assuming is an encoding."

Your assumption is incorrect. A .doc file is a complex binary format, not just a text encoding. (Try opening a .doc file in Notepad.)
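One way to confirm that an upload is a binary container rather than encoded text is to look at its first bytes. The sketch below assumes a legacy .doc file, which is stored as an OLE2 compound file beginning with the signature D0 CF 11 E0 A1 B1 1A E1. The class and method names are hypothetical, and the check consumes the first eight bytes, so a real controller would need a fresh stream for any further processing.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class WordFormatSniffer {
    // Signature found at the start of OLE2 compound files (the .doc container).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Hypothetical helper: true if the stream starts with the OLE2 signature.
    // Note: this reads (and therefore consumes) up to 8 bytes of the stream.
    static boolean looksLikeLegacyDoc(InputStream in) throws IOException {
        byte[] head = new byte[OLE2_MAGIC.length];
        int n = in.readNBytes(head, 0, head.length); // Java 9+
        return n == head.length && Arrays.equals(head, OLE2_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        byte[] plainText = "just some text".getBytes();
        System.out.println(looksLikeLegacyDoc(new ByteArrayInputStream(plainText)));  // false
        System.out.println(looksLikeLegacyDoc(new ByteArrayInputStream(OLE2_MAGIC))); // true
    }
}
```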

You are getting numbers because you are printing numbers. InputStream.read() returns an int. "" + fis.read() converts each returned int to a String.
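The difference between printing byte values and printing text can be shown with a small sketch. It assumes a plain-text payload encoded as UTF-8, which is not the case for a .doc file; the class and method names are hypothetical, and readAllBytes() requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ByteVersusTextDemo {
    // Mirrors the question's approach: each byte becomes its decimal value.
    static String asNumbers(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            sb.append(b).append(' ');
        }
        return sb.toString().trim();
    }

    // Decodes the same bytes as UTF-8 text; only meaningful for plain-text content.
    static String asText(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "Hi".getBytes(StandardCharsets.UTF_8);
        System.out.println(asNumbers(new ByteArrayInputStream(data))); // prints 72 105
        System.out.println(asText(new ByteArrayInputStream(data)));    // prints Hi
    }
}
```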

If you really want to print the contents of the file, write the bytes directly:

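int b;
// read() returns the next byte as an int (0-255), or -1 at end of stream
while ((b = fis.read()) >= 0) {
    System.out.write(b);
}
// write(int) may leave bytes buffered, so flush explicitly
System.out.flush();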

If you’re using Java 9 or later, you can just use:

fis.transferTo(System.out);
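As a rough, self-contained illustration of transferTo, the sketch below drains a stream into a ByteArrayOutputStream so the result can be inspected before printing. The class name TransferToDemo and the helper drain are hypothetical, and a ByteArrayInputStream stands in for the uploaded file's stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class TransferToDemo {
    // Hypothetical helper: copies every remaining byte of the stream into memory.
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        in.transferTo(out); // Java 9+; returns the number of bytes transferred
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "plain text upload\n".getBytes(StandardCharsets.UTF_8);
        byte[] copy = drain(new ByteArrayInputStream(data));
        System.out.write(copy); // raw bytes, no int-to-String conversion
        System.out.flush();
    }
}
```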

However, neither option will show the contents of a Word document in a readable form. You will need a library that can read the text content from a Word file, like Apache POI. (There are other libraries available; you may want to search for them.)

Upvotes: 2
