Xerix

Reputation: 441

Correct decoding of base64 to blob?

I am working on a project that uses WebSockets and am trying to upload a user-selected file to the server.

Using FileReader.readAsDataURL I can pick a file and retrieve its base64 encoding. The problem starts on the server side: when I decode it with Perl's MIME::Base64 decode_base64, I get a binary file without any error, but it is 24 bytes too long for an XLS file and 19 bytes too long for a ZIP file, and it is empty for a RAR file.

Inspecting the binary result, I found that the additional bytes are all at the beginning of the file and appear to be meaningless.

For example, Test.XLS is 29696 bytes; after decoding on the server it is 29720 bytes. The 24 extra "header bytes" are (in hex): 75 AB 5A 6A 9A 65 89 C6 AD 8A 89 FF BE 77 66 B1 EC 5C 7A 56 DA B1 EE B8. With these bytes the file is corrupted; without them the file is OK.

THE PERL DECODE ALGORITHM:

use MIME::Base64;
# The temporary file holds the text produced by JS FileReader.readAsDataURL, as uploaded earlier.
if (open(my $in, '<', "$filepath.tmp")) {
    my @lines = <$in>;
    close($in);
    if (open(my $out, '>', $filepath)) {
        binmode($out);
        # Decode the whole uploaded text, exactly as received, and write it as binary.
        print {$out} decode_base64(join('', @lines));
        close($out);
        # response back to the client
    }
    else {  } # error response was removed as not relevant for this question
}

MY QUESTIONS:

  1. What am I missing? Should I simply crop the header bytes? The counts 24 and 19 may be specific to these files; for other file types the "header" might have a different length, and I have no way of knowing it.
  2. Isn't base64 encoding standard across all file types?
  3. I tried decoding only the pure base64 characters after the ",", but that produces an error.

Upvotes: 2

Views: 2282

Answers (2)

Strip the data-URL prefix with a replace. For an Excel file the prefix is: data:application/vnd.ms-excel;base64,

In my case this is done in ReactJS, for example:

this.state.archivo_csv.replace('data:application/vnd.ms-excel;base64,','')
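
Hard-coding the MIME type only works for that one file type. A more general sketch is shown here, assuming the string came from FileReader.readAsDataURL and therefore has the form "data:<mime>;base64,<payload>". The helper name stripDataUrlPrefix is hypothetical, not part of any library:

```javascript
// Hypothetical helper: returns only the base64 payload of a data URL.
// Assumes the input looks like "data:<mime>;base64,<payload>".
// A comma is not part of the base64 alphabet, so the first comma
// marks the end of the data-URL header.
function stripDataUrlPrefix(dataUrl) {
  const comma = dataUrl.indexOf(',');
  // No comma means there is no data-URL header; return the input unchanged.
  return comma === -1 ? dataUrl : dataUrl.slice(comma + 1);
}

const payload = stripDataUrlPrefix('data:application/vnd.ms-excel;base64,SGVsbG8=');
console.log(payload); // SGVsbG8=
```

The same call should work regardless of whether the file is an XLS, ZIP or RAR, because it does not depend on the MIME type in the header.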

Upvotes: 1

Steffen Ullrich

Reputation: 123461

Your problem lies outside the code you show. FileReader.readAsDataURL does not return just the base64 representation of the data; it returns a data URL, which looks like this:

 data:application/octet-stream;base64,...base64-encoded-data...

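If you feed this whole string into the base64 decoder, it treats everything as base64 input and silently ignores any characters that are not valid base64 (such as ":", ";" and ","). The letters, digits and "/" in the prefix are valid base64 characters, so they are decoded too, and you get extra bytes in front of the real content.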

You need to fix this either in your JavaScript code before sending the data or in the Perl code. In Perl you could simply strip everything in front of the real base64, i.e.

 s{\A.*?;base64,}{}s

Based on your comment the first bytes of your input file are:

 data:application/vnd.ms-excel;base64,

This is the part you need to strip from the file; the base64 data starts only after this prefix. If you instead try to interpret this prefix as base64, you get the following bytes (as hex):

 75 ab 5a 6a 9a 65 89 c6  ad 8a 89 ff be 77 66 b1 ec 5c 7a 56 da b1 ee b8

which is exactly what you see as the invalid header in your decoded output.
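
This can be checked with a small Node.js sketch. It assumes a lenient decoder that drops every character outside the base64 alphabet before decoding, which mimics the behaviour described above; it is not the MIME::Base64 implementation itself, and the function name lenientDecodeToHex is hypothetical:

```javascript
// Sketch of a lenient decoder (assumption: Node.js, which provides Buffer).
// Characters outside the base64 alphabet are dropped, then the rest is decoded.
function lenientDecodeToHex(text) {
  const kept = text.replace(/[^A-Za-z0-9+\/=]/g, '');
  return Buffer.from(kept, 'base64').toString('hex');
}

const prefix = 'data:application/vnd.ms-excel;base64,';
const hex = lenientDecodeToHex(prefix);

console.log(hex.length / 2); // 24 (number of decoded bytes)
console.log(hex);            // the decoded prefix as one hex string
```

The prefix contains 32 valid base64 characters, and every 4 base64 characters decode to 3 bytes, which is where the 24 extra bytes for the XLS file come from.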

Upvotes: 4
