Reputation: 91
I am writing a compression program and need to write bit data to a binary file using C++. If anyone could advise on the write statement, or point me to a website with advice, I would be very grateful.
Apologies if this is a simple or confusing question; I am struggling to find answers on the web.
Upvotes: 6
Views: 5623
Reputation: 1407
With the class below you can write and read a file bit by bit:
class bitChar{
public:
unsigned char* c;
int shift_count;
string BITS;
bitChar()
{
shift_count = 0;
c = (unsigned char*)calloc(1, sizeof(char));
}
string readByBits(ifstream& inf)
{
string s ="";
char buffer[1];
while (inf.read (buffer, 1))
{
s += getBits(*buffer);
}
return s;
}
void setBITS(string X)
{
BITS = X;
}
int insertBits(ofstream& outf)
{
int total = 0;
while(BITS.length())
{
if(BITS[0] == '1')
*c |= 1;
*c <<= 1;
++shift_count;
++total;
BITS.erase(0, 1);
if(shift_count == 7 )
{
if(BITS.size()>0)
{
if(BITS[0] == '1')
*c |= 1;
++total;
BITS.erase(0, 1);
}
writeBits(outf);
shift_count = 0;
free(c);
c = (unsigned char*)calloc(1, sizeof(char));
}
}
if(shift_count > 0)
{
*c <<= (7 - shift_count);
writeBits(outf);
free(c);
c = (unsigned char*)calloc(1, sizeof(char));
}
outf.close(); // flush and close here so the file can be read back right away
return total;
}
string getBits(unsigned char X)
{
stringstream itoa;
for(unsigned s = 7; s > 0 ; s--)
{
itoa << ((X >> s) & 1);
}
itoa << (X&1) ;
return itoa.str();
}
void writeBits(ofstream& outf)
{
outf.put(static_cast<char>(*c)); // unformatted write of one byte
}
~bitChar()
{
if(c)
free(c);
}
};
For example (the includes and the using-directive must come before the class definition):
#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <stdlib.h>
using namespace std;
int main()
{
ofstream outf("Sample.dat", ios::binary); // binary mode: no newline translation
ifstream inf("Sample.dat", ios::binary);
string enCoded = "101000001010101010";
//write to file
cout << enCoded << endl ; //print 101000001010101010
bitChar bchar;
bchar.setBITS(enCoded);
bchar.insertBits(outf);
//read from file
string decoded =bchar.readByBits(inf);
cout << decoded << endl ; //print 101000001010101010000000
return 0;
}
Upvotes: 1
Reputation: 543
For writing binary, the trick I have found most helpful is to store all the binary data in a single array in memory and then write it to the hard drive in one go. In my experience, writing a bit at a time, a byte at a time, or an unsigned long long at a time is not as fast as keeping all the data in an array and using a single call to "fwrite()" to store it to the hard drive.
size_t fwrite ( const void * ptr, size_t size, size_t count, FILE * stream );
Ref: http://www.cplusplus.com/reference/clibrary/cstdio/fwrite/
In English:
fwrite( [array* of stored data], [size in bytes of array OBJECT. For unsigned chars -> 1, for unsigned long longs -> 8], [number of instances in array], [FILE*])
Always check your returns for validation of success!
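As a minimal sketch of that approach: the two helpers below (packBits and writeAll are hypothetical names, not part of any library) pack a string of '0'/'1' characters into an in-memory array and then store it with one fwrite() call, checking the return value. It assumes bits are packed most significant bit first and that the last byte is padded with zeros.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: pack a string of '0'/'1' characters into bytes,
// most significant bit first. The last byte is zero-padded if needed.
std::vector<unsigned char> packBits(const std::string& bits)
{
    std::vector<unsigned char> bytes((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i)
    {
        if (bits[i] == '1')
            bytes[i / 8] |= static_cast<unsigned char>(0x80u >> (i % 8));
    }
    return bytes;
}

// Hypothetical helper: write the whole buffer with a single fwrite call.
// Returns true only if every byte was written and the file closed cleanly.
bool writeAll(const char* path, const std::vector<unsigned char>& bytes)
{
    std::FILE* f = std::fopen(path, "wb"); // "b" selects binary mode
    if (!f)
        return false;

    bool ok = true;
    if (!bytes.empty())
    {
        // fwrite returns the number of elements actually written.
        const std::size_t written =
            std::fwrite(bytes.data(), sizeof(unsigned char), bytes.size(), f);
        ok = (written == bytes.size());
    }
    if (std::fclose(f) != 0)
        ok = false;
    return ok;
}
```

Calling writeAll("Sample.dat", packBits("101000001010101010")) would store three bytes; the reader needs some way to know that only the first 18 bits are meaningful.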
Additionally, an argument can be made that making the object type as large as possible is the fastest way to go ([unsigned long long] > [char]). While I am not versed in the code behind "fwrite()", I suspect that converting from the natural object used in your code to [unsigned long long], combined with the write itself, will take more time than letting "fwrite()" make do with what you have.
Back when I was learning Huffman coding, it took me a few hours to realize that there was a difference between [char] and [unsigned char]. Note that for this method you should always use unsigned variables to store the pure binary.
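One place where the difference tends to show up is indexing a 256-entry frequency table with a byte read from a file. The sketch below uses a hypothetical helper, tableIndex, and assumes 8-bit bytes. On platforms where plain char is signed, a byte such as 0xE9 converts to a negative int, so it is converted to unsigned char first.

```cpp
#include <cstddef>

// Hypothetical helper: map a byte to a table index in the range 0..255.
// Converting through unsigned char avoids a negative index on platforms
// where plain char is signed.
std::size_t tableIndex(char byte)
{
    return static_cast<unsigned char>(byte);
}
```

With a signed 8-bit type, the same bit pattern would convert to -23 rather than 233, which is not a valid array index.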
Upvotes: 2
Reputation: 14222
Collect the bits into whole bytes, such as an unsigned char or std::bitset (where the bitset size is a multiple of CHAR_BIT), then write whole bytes at a time. Computers "deal with bits", but the available abstraction – especially for IO – is that you, as a programmer, deal with individual bytes. Bitwise manipulation can be used to toggle specific bits, but you're always handling byte-sized objects.
At the end of the output, if you don't have a whole byte, you'll need to decide how that should be stored. Both iostreams and stdio can write unformatted data using ostream::write and fwrite, respectively.
Instead of a single char or bitset<8> (8 being the most common value for CHAR_BIT), you might consider using a larger block size, such as an array of 4-32, or more, chars or the equivalent sized bitset.
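As a rough illustration of a larger block, the sketch below converts a std::bitset<32> into four bytes that can then be written in one call. The function name toBytes is hypothetical, and the code assumes CHAR_BIT is 8 and that the highest-numbered bit of the bitset should come first in the output.

```cpp
#include <array>
#include <bitset>
#include <cstddef>

// Hypothetical helper: split a 32-bit bitset into 4 bytes,
// highest-numbered bits first. Assumes CHAR_BIT == 8.
std::array<unsigned char, 4> toBytes(const std::bitset<32>& bits)
{
    const unsigned long value = bits.to_ulong();
    std::array<unsigned char, 4> bytes = {};
    for (std::size_t i = 0; i < 4; ++i)
        bytes[i] = static_cast<unsigned char>((value >> (8 * (3 - i))) & 0xFFu);
    return bytes;
}
```

The resulting array can be handed to ostream::write or fwrite as a single 4-byte block.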
Upvotes: 3