a4w

Reputation: 577

Encode a big file using base64

I'm trying to encode and decode files that range from several MB to several GB using base64. However, some pieces of the data get encoded or decoded incorrectly, which produces strange characters like: � �̴.

I'm reading the file chunk by chunk, then encoding and saving each chunk individually (that is probably the problem, but I cannot figure it out).

Here is what I have tried so far:

<?php

function encode_file($Ifilename, $Efilename){
    $handle = fopen($Ifilename, 'rb');
    $outHandle = fopen($Efilename, 'wb');
    $bufferSize = 8151;
    while(!feof($handle)){
        $buffer = fread($handle, $bufferSize);
        $ebuffer = base64_encode($buffer);
        fwrite($outHandle, $ebuffer);
    }
    fclose($handle);
    fclose($outHandle);
}

function decode_file($Ifilename, $Efilename){
    $handle = fopen($Ifilename, 'rb');
    $outHandle = fopen($Efilename, 'wb');
    $bufferSize = 8151;
    while(!feof($handle)){
        $buffer = fread($handle, $bufferSize);
        $dbuffer = base64_decode($buffer);
        fwrite($outHandle, $dbuffer);
    }
    fclose($handle);
    fclose($outHandle);
}

encode_file('input.txt', 'out.bin'); // big text file, ~4 MB

decode_file('out.bin', 'out.txt');

Upvotes: 1

Views: 3341

Answers (1)

a4w

Reputation: 577

After reading the whole Wikipedia article on base64, I found that every 3 bytes of input encode to 4 base64 characters. Chunks that do not line up with these groups were causing the file corruption.
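
To make the 3-to-4 grouping concrete, here is a minimal sketch using short literal strings (the sample strings are only illustrative). It shows how `=` padding appears whenever the input length is not a multiple of 3, and why concatenating separately encoded chunks can differ from encoding the whole input at once:

```php
<?php
// 3 input bytes map to exactly 4 base64 characters; shorter input is padded with '='.
echo base64_encode('abc'), "\n"; // YWJj
echo base64_encode('ab'), "\n";  // YWI=
echo base64_encode('a'), "\n";   // YQ==

// Encoding the whole input in one call:
$whole = base64_encode('abcdef');                        // YWJjZGVm

// Encoding a 4-byte chunk and a 2-byte chunk separately puts padding in the middle:
$chunked = base64_encode('abcd') . base64_encode('ef');  // YWJjZA==ZWY=

// Chunks whose length is a multiple of 3 line up with the groups:
$aligned = base64_encode('abc') . base64_encode('def');  // YWJjZGVm

var_dump($whole === $chunked); // bool(false)
var_dump($whole === $aligned); // bool(true)
```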

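The fix is to set the buffer size to n bytes when encoding, where n is a multiple of 3, so that `=` padding can only appear at the very end of the output.

When decoding, set the buffer size to N, where N is a multiple of 4, so that every chunk contains only complete 4-character groups. (The 8151 used in the question is a multiple of 3 but not of 4, so it was the decoding step that split groups across chunks.)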
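
The chunk-size rule can be checked in memory before touching any files. The following is a small sketch; `encode_chunked` and `decode_chunked` are hypothetical helper names that work on strings instead of file handles, and the sample data is arbitrary:

```php
<?php
// Hypothetical helper: base64-encode a string chunk by chunk.
function encode_chunked(string $data, int $chunkSize): string {
    $out = '';
    foreach (str_split($data, $chunkSize) as $chunk) {
        $out .= base64_encode($chunk);
    }
    return $out;
}

// Hypothetical helper: base64-decode a string chunk by chunk.
function decode_chunked(string $encoded, int $chunkSize): string {
    $out = '';
    foreach (str_split($encoded, $chunkSize) as $chunk) {
        $out .= base64_decode($chunk);
    }
    return $out;
}

$data = str_repeat("The quick brown fox. ", 50); // 1050 bytes of sample data

// Encoding with a multiple of 3 matches a single base64_encode() call.
$encoded = encode_chunked($data, 3 * 5);
var_dump($encoded === base64_encode($data));         // bool(true)

// Decoding with a multiple of 4 restores the original data.
var_dump(decode_chunked($encoded, 4 * 5) === $data); // bool(true)

// Decoding with a size that is not a multiple of 4 splits groups and corrupts the data.
var_dump(decode_chunked($encoded, 7) === $data);     // bool(false)
```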

The working code:

<?php
function encode_file($Ifilename, $Efilename){
    $handle = fopen($Ifilename, 'rb');
    $outHandle = fopen($Efilename, 'wb');
    $bufferSize = 3 * 256; // multiple of 3: every 3 raw bytes encode to 4 base64 characters
    while(!feof($handle)){
        $buffer = fread($handle, $bufferSize);
        $ebuffer = base64_encode($buffer);
        fwrite($outHandle, $ebuffer);
    }
    fclose($handle);
    fclose($outHandle);
}

function decode_file($Ifilename, $Efilename){
    $handle = fopen($Ifilename, 'rb');
    $outHandle = fopen($Efilename, 'wb');
    $bufferSize = 4 * 256; // multiple of 4: every 4 base64 characters decode to 3 raw bytes
    while(!feof($handle)){
        $buffer = fread($handle, $bufferSize);
        $dbuffer = base64_decode($buffer);
        fwrite($outHandle, $dbuffer);
    }
    fclose($handle);
    fclose($outHandle);
}

encode_file('input.txt', 'out.bin');

decode_file('out.bin', 'output.txt');
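
As an alternative sketch (not what the code above does), PHP's built-in `convert.base64-encode` and `convert.base64-decode` stream filters can do the grouping internally, so the read buffer size no longer has to be a multiple of 3 or 4. The function names `encode_file_filtered` and `decode_file_filtered` are hypothetical, and error handling for `fopen` is omitted for brevity:

```php
<?php
// Hypothetical helper: copy $inPath to $outPath, base64-encoding on write.
function encode_file_filtered(string $inPath, string $outPath): void {
    $in  = fopen($inPath, 'rb');
    $out = fopen($outPath, 'wb');
    // The write filter buffers incomplete 3-byte groups between fwrite() calls.
    stream_filter_append($out, 'convert.base64-encode', STREAM_FILTER_WRITE);
    while (!feof($in)) {
        $chunk = fread($in, 8192); // any buffer size works here
        if ($chunk === false) {
            break;
        }
        fwrite($out, $chunk);
    }
    fclose($in);
    fclose($out); // closing flushes the final (possibly padded) group
}

// Hypothetical helper: copy $inPath to $outPath, base64-decoding on read.
function decode_file_filtered(string $inPath, string $outPath): void {
    $in  = fopen($inPath, 'rb');
    $out = fopen($outPath, 'wb');
    stream_filter_append($in, 'convert.base64-decode', STREAM_FILTER_READ);
    while (!feof($in)) {
        $chunk = fread($in, 8192);
        if ($chunk === false) {
            break;
        }
        fwrite($out, $chunk);
    }
    fclose($in);
    fclose($out);
}
```

Usage would mirror the calls above, for example `encode_file_filtered('input.txt', 'out.bin');` followed by `decode_file_filtered('out.bin', 'output.txt');`. Comparing `hash_file('sha256', 'input.txt')` with `hash_file('sha256', 'output.txt')` is a quick way to confirm the round trip for either approach.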

Upvotes: 1
