L4nce0

Reputation: 67

Split a binary file into chunks c++

I've been banging my head against dividing a file into chunks so it can be sent over sockets. I can read and write a file easily without splitting it into chunks. The code below runs and kind of works: it writes a text file, but with a garbage character in it. If this were only for text files, that would be no problem, but JPEGs don't survive the garbage.

I've been at it for a few days and have done my research, so it's time to get some help. I do want to stick strictly to binary reads, as this needs to handle any file.

I've seen a lot of slick examples out there (none of them worked for me with JPEGs), mostly along the lines of while(file)... I belong to the camp that says: if you know the size, use a for loop, not a while loop.

Thank you for the help!!

vector<char*> readFile(const char* fn){
    vector<char*> v;
    ifstream::pos_type size;
    char * memblock;
    ifstream file;
    file.open(fn,ios::in|ios::binary|ios::ate);
    if (file.is_open()) {
        size = fileS(fn);
        file.seekg (0, ios::beg);
        int bs = size/3; // arbitrary. Actual program will use the socket send size
        int ws = 0;
        int i = 0;
        for(i = 0; i < size; i+=bs){
            if(i+bs > size)
                ws = size%bs;
            else
                ws = bs;
            memblock = new char [ws];
            file.read (memblock, ws);
            v.push_back(memblock);
        }
    }
    else{
        exit(-4);
    }
    return v;
}


int main(int argc, char **argv) {
    vector<char*> v = readFile("foo.txt");
    ofstream myFile ("bar.txt", ios::out | ios::binary);
    for(vector<char*>::iterator it = v.begin(); it!=v.end(); ++it ){
        myFile.write(*it,strlen(*it));
    }
}

Upvotes: 2

Views: 4972

Answers (3)

Fluvid

Reputation: 268


The problem is that you are using strlen to calculate the size of the array to be written. If a 0 byte is part of the binary data, you will not write the right size. Instead, use a pair<char*,int>, where the int specifies the number of bytes to be written, and you will be golden. Like:

#include <iostream>
#include <vector>
#include <fstream>
#include <stdlib.h>
#include <string.h>
using namespace std;

ifstream::pos_type fileS(const char* fn)
{
    ifstream file;
    file.open(fn,ios::in|ios::binary);
    file.seekg(0, ios::end);
    ifstream::pos_type ret= file.tellg();
    file.seekg(0,ios::beg);
    ret=ret-file.tellg();
    file.close();
    return ret;
}

vector< pair<char*,int> > readFile(const char* fn){
    vector< pair<char*,int> > v; // each entry: (chunk buffer, chunk size in bytes)
    ifstream::pos_type size;
    char * memblock;
    ifstream file;
    file.open(fn,ios::in|ios::binary|ios::ate);
    if (file.is_open()) {
        size = fileS(fn);
        file.seekg (0, ios::beg);
        int bs = size/3; // arbitrary. Actual program will use the socket send size
        if (bs == 0)
            bs = 1; // files under 3 bytes: avoid a zero block size (endless loop, % by zero)
        int ws = 0;
        int i = 0;
        cout<<"size:"<<size<<" bs:"<<bs<<endl;
        for(i = 0; i < size; i+=bs){
            if(i+bs > size)
                ws = size%bs; // last, partial block
            else
                ws = bs;
            cout<<"read:"<<ws<<endl;
            memblock = new char [ws]; // never freed here; the caller owns the buffers
            file.read (memblock, ws);
            v.push_back(make_pair(memblock,ws));
        }
    }
    else{
        exit(-4);
    }
    return v;
}


int main(int argc, char **argv) {
    vector< pair<char*,int> > v = readFile("a.png");
    ofstream myFile ("out.png", ios::out | ios::binary);
    for(vector< pair<char*,int> >::iterator it = v.begin(); it!=v.end(); ++it ){
        pair<char*,int> p=*it;
        myFile.write(p.first,p.second);
    }
}

Upvotes: 2

rici

Reputation: 241701

You should never do this:

    myFile.write(*it,strlen(*it));

on binary data. strlen counts bytes until it hits a byte which contains a 0 (NUL as we like to say, but it's an honest 0). If you read enough binary data, you will hit a NUL, and you'll get a short count. But actually the situation could be a lot worse, because nowhere do you store the NUL for strlen to find. You're just counting on there being one beyond the end of the datablock you acquire to read the file into.
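To make the short count concrete, here is a small sketch. The buffer contents and the helper name reported_length are made up for illustration; the trailing 0 is only there so that strlen stays inside the array, which the code in the question does not guarantee.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: the length strlen reports for a block,
// i.e. the number of bytes before the first 0 byte.
std::size_t reported_length(const char* block) {
    return std::strlen(block);
}

// Eight bytes of made-up "binary" data with a 0 byte at index 3.
// The final 0 exists only to keep strlen within the array bounds.
const char sample_block[8] = {'\x89', 'P', 'N', '\0', 'G', '\r', '\n', '\0'};
```

Here sizeof(sample_block) is 8, but reported_length(sample_block) is 3, so a write sized by strlen would silently drop the last five bytes.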

So don't do that. Remember the number of bytes in each block (you could use a vector<pair<char*, int> >, but there are a lot of more C++-like possibilities) and use that to write the data.
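One such possibility, as a rough sketch rather than the only way to do it: keep each block in a vector<char>, which carries its own size and frees itself. This assumes C++11; the names read_chunks and write_chunks and the chunk_size parameter are invented for this example. It sizes each block from gcount(), the number of bytes the last read actually delivered, instead of computing the size up front.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <utility>
#include <vector>

// Read a file in binary mode into blocks of at most chunk_size bytes.
// Each block is a vector<char>, so its byte count travels with it.
std::vector<std::vector<char>> read_chunks(const std::string& path,
                                           std::size_t chunk_size) {
    std::vector<std::vector<char>> chunks;
    std::ifstream in(path, std::ios::binary);
    if (!in || chunk_size == 0)
        return chunks;  // unreadable file or unusable size: no blocks
    while (true) {
        std::vector<char> buf(chunk_size);
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        std::streamsize got = in.gcount();  // bytes actually read this time
        if (got <= 0)
            break;  // nothing left to read
        buf.resize(static_cast<std::size_t>(got));  // trim the last, partial block
        chunks.push_back(std::move(buf));
    }
    return chunks;
}

// Write the blocks back out, using each block's stored size (never strlen).
void write_chunks(const std::string& path,
                  const std::vector<std::vector<char>>& chunks) {
    std::ofstream out(path, std::ios::binary);
    for (const std::vector<char>& c : chunks)
        out.write(c.data(), static_cast<std::streamsize>(c.size()));
}
```

With these, something like write_chunks("out.png", read_chunks("a.png", 1024)) should copy the file byte for byte, 0 bytes included, and for a socket you would pass c.data() and c.size() to your send call in the same way.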

Upvotes: 0

Alan

Reputation: 46813

    myFile.write(*it,strlen(*it));

This line calls strlen on binary data. I suspect that is your culprit. If not, it's certainly a code smell.

Upvotes: 1
