F Brooks

Reputation: 5

Extraneous data when reading/writing binary data

I am trying to write a process that retrieves a file (various types: pdf, txt, docx, tif, etc.) via a REST API call, decodes the file's base64-encoded binary data, and writes the file to another location (to be picked up by another process). All of the above is working, but if the file type is anything other than txt, the newly written file will not open.

public File retrieveDocument(String in_ItemId, File in_DestinationFile, Map<String, String> in_DocumentProperties)
        throws IOException {
    byte[] binaryData = new byte[8198];

    try {
        String url = "filestore url";

        RestTemplate restTemplate = new RestTemplate();
        List<HttpMessageConverter<?>> messageConverters = new ArrayList<HttpMessageConverter<?>>();
        messageConverters.add(new MappingJacksonHttpMessageConverter());
        restTemplate.getMessageConverters().add(new StringHttpMessageConverter());
        restTemplate.setMessageConverters(messageConverters);

        Map documentMap = restTemplate.getForObject(url, Map.class);

        if (documentMap.get("binaryData") != null) {
            binaryData = Base64.decodeBase64(((String) documentMap.get("binaryData")).getBytes());
        }

        OutputStream outputStream = new BufferedOutputStream(new FileOutputStream(in_DestinationFile));
        outputStream.write(binaryData);
        outputStream.close();
    } catch (Exception e) {
        e.printStackTrace();
    }

    return in_DestinationFile;
}

When I open both the original and new files in a text editor (e.g., Notepad++) and compare the two, there are a number of additional characters (mainly question marks) in the new file. Below is an example from a tif image. I've added ^ under some of the additional characters in the new file.

Original file:

    II* P  €?à@$
    „BaP¸d6ˆDbQ8¤V-ŒFcQ¸äv=HdR9$–M'”JeR¹d¶]/˜LfS9¤Öm7œNgS¹äö}? PhT:%3Ñ©Tºe6O¨

    ‡ÄbqX¼f7•ß²<¦W-—ÌfsY¼æw=ŸÐhlÐ=—M§ÔjuZ½f·]¯Øll™-–×m·Ünw[½æ÷}¿à_tœ'Çäry\¾g7hÚsú]

New file:

    II* P  €?à@$
    „BaP¸d6ˆDbQ8¤V-ŒFcQ¸äv=?HdR9$–M'”JeR¹d¶]/˜LfS9¤Öm7œNgS¹äö}? PhT:%3Ñ©Tºe6?O¨
                           ^                                                ^
    ‡ÄbqX¼f7?•ß²<¦W-—ÌfsY¼æw=ŸÐhlÐ=—M§ÔjuZ½f·]¯Øll™-–×m·Ünw[½æ÷}¿à_tœ'?Çäry\¾g7?hÚsú]
            ^                                                                  ^

Any ideas as to what I'm doing wrong?

Upvotes: 0

Views: 135

Answers (1)

Kayaman

Reputation: 73558

Writer classes including PrintWriter are for text data. Stream classes such as OutputStream are for binary data.

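You're converting binary data into a String, at which point some of it can get corrupted: bytes that have no mapping in the charset being used are replaced with a substitute character, which is typically where stray question marks come from.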

Get rid of the String strBinaryData and just write the byte[] you get from Base64.decodeBase64 straight to the OutputStream.
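To make the difference concrete, here is a minimal sketch rather than a drop-in fix. It assumes the JDK's own java.util.Base64 in place of the commons-codec Base64.decodeBase64 used in the question, and the class name, method names, and sample bytes are made up for illustration. It compares a round trip through a String with writing the decoded byte[] untouched.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;

public class BinaryWriteSketch {

    // Lossy path: the decoded bytes are turned into text and back again.
    // Any byte with no mapping in the charset is replaced along the way.
    public static byte[] viaString(String base64) {
        byte[] decoded = Base64.getDecoder().decode(base64);
        String asText = new String(decoded, StandardCharsets.US_ASCII);
        return asText.getBytes(StandardCharsets.US_ASCII);
    }

    // Lossless path: the decoded bytes are never treated as text.
    public static byte[] direct(String base64) {
        return Base64.getDecoder().decode(base64);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: a few bytes shaped like the start of a TIFF file,
        // including values above 0x7F that are not valid US-ASCII.
        byte[] original = {0x49, 0x49, 0x2A, 0x00, (byte) 0x80, (byte) 0xE0, 0x40, 0x24};
        String base64 = Base64.getEncoder().encodeToString(original);

        // Write the decoded bytes to a temporary file and read them back.
        Path target = Files.createTempFile("document", ".bin");
        Files.write(target, direct(base64));
        byte[] written = Files.readAllBytes(target);
        Files.delete(target);

        System.out.println("direct matches:     " + Arrays.equals(original, written));
        System.out.println("via String matches: " + Arrays.equals(original, viaString(base64)));
    }
}
```

Running it should report a match for the direct path and a mismatch for the String round trip. The charset you are actually hitting may differ (the platform default rather than US-ASCII), but the mechanism is the same.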

Upvotes: 2
