David Merinos

Reputation: 1297

Read a binary file where every element is a 2 byte integer

I have a binary file with the .b16 extension. According to its documentation, every value in it is stored as an unsigned 16-bit integer (range 0..65535, byte order: low byte first, then high byte).

Main files are otypes03.b08 to otypes08.b08, otypes09.b16 and otypes10.b16. They contain the coordinates of all inequivalent point sets (order types) for the given number n of points.

I've successfully read all the files with the .b08 extension, but when I read the .b16 files I don't get the expected values.

What I have so far (a modified version of my reading function, used only for the .b16 format):

int readPoints(int n, string file_name, vector<Point> & vPoints){
    ifstream input(file_name, std::ios::binary);
    if(input.fail()) return 1;

    vector< unsigned char> buffer(std::istreambuf_iterator<char>(input), {});
    //Copying each pair of binary points to a vector of Point objects
    Point temp;
    for( unsigned int i=0;i< buffer.size();i+=4){
        temp.x = buffer[i] | buffer[i+1]  ;
        temp.y = buffer[i+2] | buffer[i+3]  ;
        vPoints.push_back(temp);
    }
    return 0;
}

Every element of the file is a coordinate of a point in the plane. However, I seem to be reading it incorrectly: the coordinates I get are not the ones they should be. I don't know what I'm doing wrong.

What I use for the .b08 format:

//Reads a file of binary points and stores it on vector vPoints.
int readPoints(int n, string file_name, vector<Point> & vPoints){
    ifstream input(file_name, std::ios::binary);
    if(input.fail()) return 1;
    // copies all data into buffer
    //Stored as unsigned int. Arithmetic operations (+-*/) can be used! :)
    //Can be treated as signed int or unsigned int.
    vector< unsigned char> buffer(std::istreambuf_iterator<char>(input), {});
    //Copying each pair of binary points to a vector of Point objects
    Point temp;
    cout << "Buffer size: " << buffer.size() << endl;
    for( unsigned int i=0;i< buffer.size();i+=2){
        temp.x = buffer[i];
        temp.y = buffer[i+2];
        vPoints.push_back(temp);
    }
    return 0;
}

More information on the database I'm using is here: http://www.ist.tugraz.at/aichholzer/research/rp/triangulations/ordertypes/

The file I'm trying to read is otypes09.b16, which is 5.7 MB, in case you want to try it out.

Thanks for your time.

Upvotes: 0

Views: 1279

Answers (3)

Mike

Reputation: 302

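In cases like this, depending on the meaning of the data and having absolute resolution, I like to use unions. What you could do is have a union that has an int member as well as a struct containing two shorts. Each short then holds the bits of one of the two 16-bit integers. Note that this only yields the right values when the host's byte order matches the file's, and reading a union member other than the one last written is not guaranteed to work in standard C++.

Having said that, the other answers may do just fine for you. There are many ways of doing things like this, so design the API that suits you!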

Upvotes: 0

Jeremy Friesner

Reputation: 73294

for( unsigned int i=0;i< buffer.size();i+=4){
    temp.x = buffer[i] | buffer[i+1]  ;
    temp.y = buffer[i+2] | buffer[i+3]  ;
    vPoints.push_back(temp);
}

The above is incorrect -- you're OR-ing the upper 8 bits directly on top of the lower 8 bits, which corrupts the data. You need to shift the high byte left by 8 bits first (which byte is the high one depends on whether the file stores its 16-bit words in big-endian or little-endian format).

If the file's data is in little-endian format, this should work:

// read in little-endian 16-bit words
for( unsigned int i=0;i< buffer.size();i+=4){
    temp.x = ((unsigned short)buffer[i+0]) | (((unsigned short)buffer[i+1])<<8);
    temp.y = ((unsigned short)buffer[i+2]) | (((unsigned short)buffer[i+3])<<8);
    vPoints.push_back(temp);
}

... or if the file's data is stored in big-endian format, it would be more like this:

// read in big-endian 16-bit words
for( unsigned int i=0;i< buffer.size();i+=4){
    temp.x = (((unsigned short)buffer[i+0])<<8) | ((unsigned short)buffer[i+1]);
    temp.y = (((unsigned short)buffer[i+2])<<8) | ((unsigned short)buffer[i+3]);
    vPoints.push_back(temp);
}
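
Since the question describes the byte order as low-byte/high-byte, the little-endian variant is presumably the one that applies. As a minimal, self-contained sketch of that variant (the names Point16, decodeLE16 and decodePointsLE16 are made up for illustration and are not from the question), the loop can be wrapped in a function that also avoids indexing past the end of the buffer when its size is not a multiple of 4:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the question's Point type.
struct Point16 {
    std::uint16_t x;
    std::uint16_t y;
};

// Combine two bytes stored low byte first into one 16-bit value.
inline std::uint16_t decodeLE16(unsigned char lo, unsigned char hi) {
    return static_cast<std::uint16_t>(lo | (hi << 8));
}

// Decode a buffer of little-endian 16-bit (x, y) pairs.
// Trailing bytes that do not form a complete 4-byte point are ignored.
std::vector<Point16> decodePointsLE16(const std::vector<unsigned char>& buffer) {
    std::vector<Point16> points;
    points.reserve(buffer.size() / 4);
    for (std::size_t i = 0; i + 3 < buffer.size(); i += 4) {
        points.push_back({decodeLE16(buffer[i], buffer[i + 1]),
                          decodeLE16(buffer[i + 2], buffer[i + 3])});
    }
    return points;
}
```

With this, the bytes 0x34 0x12 decode to 0x1234, whereas the original `buffer[i] | buffer[i+1]` would have produced 0x36.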

Upvotes: 4

Thomas Matthews

Reputation: 57749

Here is one method:

uint8_t lsb;
uint8_t msb;
uint16_t value;
std::vector<uint16_t> database;
//...
input.read((char *) &lsb, sizeof(lsb));
input.read((char *) &msb, sizeof(msb));
value = msb * 256 + lsb;
database.push_back(value);

Since this is a binary file, the read method is used. You could replace the value assignment with:
value = msb << 8 | lsb;
A good compiler should translate the first value assignment into the second anyway.

Upvotes: 1
