Reputation: 542
Goal
My goal is to quickly create a file from a large binary string (a string that contains only 1 and 0).
Straight to the point
I need a function that can achieve my goal. If I am not clear enough, please read on.
Example
Test.exe is running...
.
Inputted binary string:
1111111110101010
Writing to: c:\users\admin\desktop\Test.txt
Done!
File(Test.txt) In Byte(s):
0xFF, 0xAA
.
Test.exe executed successfully!
Explanation
I've tried
As a failed attempt to achieve my goal, I've created this simple (and possibly horrible) function (hey, at least I tried):
void BinaryStrToFile( __in const char* Destination,
__in std::string &BinaryStr )
{
std::ofstream OutputFile( Destination, std::ofstream::binary );
for( ::UINT Index1 = 0, Dec = 0;
// 8-Bit binary.
Index1 != BinaryStr.length( )/8;
// Get the next set of binary value.
// Write the decimal value as unsigned char to file.
// Reset decimal value to 0.
++ Index1, OutputFile << ( ::BYTE )Dec, Dec = 0 )
{
// Convert the 8-bit binary to decimal using the
// positional notation method - this is how it's done:
// http://www.wikihow.com/Convert-from-Binary-to-Decimal
for( ::UINT Index2 = 7, Inc = 1; Index2 + 1 != 0; -- Index2, Inc += Inc )
if( BinaryStr.substr( Index1 * 8, 8 )[ Index2 ] == '1' ) Dec += Inc;
}
OutputFile.close( );
};
Example of usage
#include "Global.h"
void BinaryStrToFile( __in const char* Destination,
__in std::string &BinaryStr );
int main( void )
{
std::string Bin = "";
// Create a binary string of 80,000,000 characters, which encodes 9.53674 MiB of data
// Note: The creation of this string will take a while.
// However, I only start to calculate the speed of writing
// and converting after it is done generating the string.
// This string is just created for an example.
std::cout << "Generating...\n";
while( Bin.length( ) != 80000000 )
Bin += "10101010";
std::cout << "Writing...\n";
BinaryStrToFile( "c:\\users\\admin\\desktop\\Test.txt", Bin );
std::cout << "Done!\n";
#ifdef IS_DEBUGGING
std::cout << "Paused...\n";
::getchar( );
#endif
return( 0 );
};
Problem
Again, that was my failed attempt to achieve my goal. The problem is the speed: it is too slow. It took more than 7 minutes. Is there any method to quickly create a file from a large binary string?
Thanks in advance,
CLearner
Upvotes: 3
Views: 1355
Reputation: 7858
Even though late, I want to place my example for handling such strings. Architecture specific optimizations may use unaligned loads of chars into multiple registers for 'squeezing' out the bits in parallel. This untested example code does not check the chars and avoids alignment and endianness requirements. It assumes the characters of that binary string to represent contiguous octets (bytes) with the most significant bit first, not words and double words, etc., where their specific representation in memory (and in that string) would require special treatment for portability.
//THIS CODE HAS NEVER BEEN TESTED! But I hope you get the idea.
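//set up an ofstream with a 64KiB buffer
//(set the buffer before opening the file; some implementations ignore pubsetbuf once the file is open)
std::vector<char> buffer(65536);
std::ofstream ofs;
ofs.rdbuf()->pubsetbuf(&buffer[0],buffer.size());
ofs.open("out.bin", std::ofstream::binary|std::ofstream::out|std::ofstream::trunc);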
//You may treat cases, where (bits % 8 != 0) as error;
//here the trailing bits of an incomplete octet are ignored
std::string::size_type bits = Bin.length() & (~std::string::size_type(0x7));
std::string::const_iterator cIt = Bin.begin();
uint8_t byte = 0;
for(std::string::size_type i = 0;i < bits;++i,++cIt)
{
    //shift in the next bit, most significant bit first
    byte = uint8_t((byte << 1) | (uint8_t(*cIt) - uint8_t('0')));
    if((i & 0x7) == 0x7) //bit 0: the octet is complete, write it and start the next one
    {
        ofs.put(char(byte));
        byte = 0;
    }
}
ofs.flush();
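Since the loop above does not check the characters, a separate validation pass can run first. The following is only a minimal sketch: `IsValidBinaryString` is a hypothetical name, and treating an empty string as invalid is an assumption, not something the answer specifies.

```cpp
#include <string>

// Hypothetical helper (not part of the answer above): returns true only if
// `bits` is non-empty, a whole number of octets long, and contains nothing
// but the characters '0' and '1'.
bool IsValidBinaryString(const std::string& bits)
{
    // Assumption: an empty string or a partial trailing octet is an error.
    if (bits.empty() || bits.length() % 8 != 0)
        return false;
    // find_first_not_of returns npos when every character is '0' or '1'.
    return bits.find_first_not_of("01") == std::string::npos;
}
```

Calling it once before the conversion keeps the hot loop free of per-character checks.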
Upvotes: 1
Reputation: 490128
I think I'd consider something like this as a starting point:
#include <bitset>
#include <fstream>
#include <algorithm>
#include <iterator>
int main() {
std::ifstream in("junk.txt", std::ios::binary | std::ios::in);
std::ofstream out("junk.bin", std::ios::binary | std::ios::out);
std::transform(std::istream_iterator<std::bitset<8> >(in),
std::istream_iterator<std::bitset<8> >(),
std::ostream_iterator<unsigned char>(out),
[](std::bitset<8> const &b) { return b.to_ulong();});
return 0;
}
Doing a quick test, this processes an input file of 80 million bytes in about 6 seconds on my machine. Unless your files are much larger than what you've mentioned in your question, my guess is this is adequate speed, and the simplicity is going to be hard to beat.
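To see what the `std::bitset<8>` extraction does without touching the disk, here is a small in-memory sketch of the same idea. `BitsetPack` is a hypothetical helper name, and reading from a `std::istringstream` instead of a file is an assumption made only to keep the example easy to check.

```cpp
#include <bitset>
#include <sstream>
#include <string>

// Hypothetical helper: same idea as the std::transform call above, but it
// reads from a std::string and returns the packed bytes instead of writing
// them to a file.
std::string BitsetPack(const std::string& text)
{
    std::istringstream in(text);
    std::string out;
    std::bitset<8> octet;
    // operator>> extracts at most 8 characters of '0'/'1' per bitset,
    // with the first character read becoming the most significant bit.
    while (in >> octet)
        out.push_back(static_cast<char>(octet.to_ulong()));
    return out;
}
```

For the input `1111111110101010` from the question, this yields the two bytes 0xFF and 0xAA.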
Upvotes: 2
Reputation: 2442
This makes a file of 80,000,000 bytes (about 76.3 MiB) of 1010101010101......
#include <stdio.h>
#include <iostream>
#include <fstream>
using namespace std;
int main( void )
{
char Bin=0;
ofstream myfile;
myfile.open (".\\example.bin", ios::out | ios::app | ios::binary);
int c=0;
Bin = 0xAA;
while( c!= 80000000 ){
myfile.write(&Bin,1);
c++;
}
myfile.close();
cout << "Done!\n";
return( 0 );
};
Upvotes: -1
Reputation: 44503
I'd suggest removing the substr call in the inner loop. You are allocating a new string and then destroying it for each character that you process. Replace this code:
for(::UINT Index2 = 7, Inc = 1; Index2 + 1 != 0; -- Index2, Inc += Inc )
if( BinaryStr.substr( Index1 * 8, 8 )[ Index2 ] == '1' )
Dec += Inc;
by something like:
for(::UINT Index2 = 7, Inc = 1; Index2 + 1 != 0; -- Index2, Inc += Inc )
if( BinaryStr[Index1 * 8 + Index2 ] == '1' )
Dec += Inc;
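Taking the same idea one step further, the whole function could index the string directly, build each byte with shifts, and write everything in one call. This is only a sketch under assumptions: `PackBits` and `BinaryStrToFileFast` are hypothetical names, and any character other than '1' is treated as a 0 bit.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: packs each group of 8 '0'/'1' characters into one
// byte, most significant bit first. Trailing characters that do not fill a
// whole byte are ignored, as in the original loop.
std::string PackBits(const std::string& BinaryStr)
{
    std::string Bytes;
    Bytes.reserve(BinaryStr.length() / 8);
    for (std::size_t Index1 = 0; Index1 + 8 <= BinaryStr.length(); Index1 += 8)
    {
        unsigned char Dec = 0;
        // Direct indexing: no temporary std::string is created per bit.
        for (std::size_t Index2 = 0; Index2 < 8; ++Index2)
            Dec = static_cast<unsigned char>((Dec << 1) | (BinaryStr[Index1 + Index2] == '1'));
        Bytes.push_back(static_cast<char>(Dec));
    }
    return Bytes;
}

// Hypothetical wrapper: writes the packed bytes with a single write call
// and returns false if opening or writing failed.
bool BinaryStrToFileFast(const char* Destination, const std::string& BinaryStr)
{
    std::ofstream OutputFile(Destination, std::ofstream::binary);
    const std::string Bytes = PackBits(BinaryStr);
    OutputFile.write(Bytes.data(), static_cast<std::streamsize>(Bytes.size()));
    return OutputFile.good();
}
```

Keeping the packing separate from the file I/O also makes the conversion easy to check in isolation.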
Upvotes: 4
Reputation:
So instead of converting back and forth between std::strings, why not use a bunch of machine word-sized integers for fast access?
const size_t bufsz = 1000000;
uint32_t *buf = new uint32_t[bufsz];
memset(buf, 0xFA, sizeof(*buf) * bufsz);
std::ofstream ofile("foo.bin", std::ofstream::binary);
for (size_t i = 0; i < bufsz; i++) {
    // formatted hex (std::setw and std::setfill need <iomanip>):
    ofile << std::hex << std::setw(8) << std::setfill('0') << buf[i];
    // or if you want raw binary data instead of formatted hex:
    ofile.write(reinterpret_cast<char *>(&buf[i]), sizeof(buf[i]));
}
delete[] buf;
For me, this runs in a fraction of a second.
Upvotes: 1
Reputation: 140569
Something not entirely unlike this should be significantly faster:
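#include <climits>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <string>

void
text_to_binary_file(const std::string& text, const char *fname)
{
    unsigned char wbuf[4096];  // 4k is a good size of "chunk to write to file"
    unsigned int i = 0, j = 0;
    std::filebuf fp;           // dropping down to filebufs may well be faster
                               // for this problem
    if (!fp.open(fname, std::ios::out|std::ios::trunc|std::ios::binary))
        std::abort();
    std::memset(wbuf, 0, sizeof wbuf);
    for (std::string::const_iterator p = text.begin(); p != text.end(); p++) {
        if (*p == '1')         // only a '1' character sets the bit
            wbuf[i] |= (1u << (CHAR_BIT - (j+1)));
        j++;
        if (j == CHAR_BIT) {
            j = 0;
            i++;
        }
        if (i == sizeof wbuf) {
            if (fp.sputn(reinterpret_cast<const char *>(wbuf), 4096) != 4096)
                std::abort();
            std::memset(wbuf, 0, sizeof wbuf);
            i = 0;
        }
    }
    // write what is left, including a zero-padded partial byte if there is one
    const std::streamsize rest = i + (j ? 1 : 0);
    if (fp.sputn(reinterpret_cast<const char *>(wbuf), rest) != rest)
        std::abort();
    fp.close();
}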
Proper error handling left as an exercise.
Upvotes: 1
Reputation: 5138
The majority of your time is spent here:
for( ::UINT Index2 = 7, Inc = 1; Index2 + 1 != 0; -- Index2, Inc += Inc )
if( BinaryStr.substr( Index1 * 8, 8 )[ Index2 ] == '1' ) Dec += Inc;
When I comment that out, the file is written in seconds. I think you need to fine-tune your conversion.
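One way to confirm where the time goes is to time the conversion on its own, without any file I/O. The sketch below assumes C++11 for `std::chrono`; `ElapsedMs` and `ConvertWithSubstr` are hypothetical names, and the loop body is copied from the question with the file write replaced by a running sum.

```cpp
#include <chrono>
#include <string>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in milliseconds, measured with std::chrono::steady_clock.
template <typename Work>
double ElapsedMs(Work work)
{
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    work();
    const std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// The conversion loop from the question, isolated from file I/O so it can
// be timed alone. It returns the sum of all byte values so the compiler
// cannot discard the work.
unsigned long ConvertWithSubstr(const std::string& BinaryStr)
{
    unsigned long Sum = 0;
    for (std::string::size_type Index1 = 0; Index1 != BinaryStr.length() / 8; ++Index1)
    {
        unsigned int Dec = 0;
        for (unsigned int Index2 = 7, Inc = 1; Index2 + 1 != 0; --Index2, Inc += Inc)
            if (BinaryStr.substr(Index1 * 8, 8)[Index2] == '1') Dec += Inc;
        Sum += Dec;
    }
    return Sum;
}
```

A call such as `ElapsedMs([&]{ Sum = ConvertWithSubstr(Bin); })` gives the conversion time alone. Swapping the `substr` expression for direct indexing and timing again shows how much of the cost came from the temporary strings.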
Upvotes: 3