IAE

Reputation: 2243

String gets junk on end after conversion to c_str()

For anyone who wants to know: this is a homework assignment.

I'm writing a vocabulary translator (English -> German and vice versa) and am supposed to save everything the user does to a file. Simple enough.

This is the code:

std::string file_name(user_name + ".reg");
std::ifstream file(file_name.c_str(), std::ios::binary | std::ios::ate);
// At this point, we have already verified the file exists. This shouldn't ever throw!
// Possible scenario:  user deletes file between calls.
assert( file.is_open() );

// Get the length of the file and reset the seek.
size_t length = file.tellg();
file.seekg(0, std::ios::beg);

// Create and write to the buffer.
char *buffer = new char[length];
file.read(buffer, length);
file.close();

// Find the last comma, after which comes the current dictionary.
std::string strBuffer = buffer;
size_t position = strBuffer.find_last_of(',') + 1;
curr_dict_ = strBuffer.substr(position);

// Start the trainer; import the dictionary.
trainer_.reset( new Trainer(curr_dict_.c_str()) );

The problem is, apparently, the curr_dict_ which is supposed to store my dictionary value. For example, my teacher has one dictionary file named 10WS_PG2_P4_de_en_gefuehle.txt. The Trainer imports the entire contents of the dictionary file like so:

std::string s_word_de;
std::string s_word_en;
std::string s_discard;
std::string s_count;
int i_word;

std::ifstream in(dictionaryDescriptor);

if( in.is_open() )
{
    getline(in, s_discard); // Discard first line.
    while( in >> i_word &&
        getline(in, s_word_de, '<') &&
        getline(in, s_discard, '>') &&
        getline(in, s_word_en, '(') &&
        getline(in, s_count, ')') )
    {   
        dict_.push_back(NumPair(s_word_de.c_str(), s_word_en.c_str(), Utility::lexical_cast<int, std::string>(s_count)));
    }
}
else
    std::cout << dictionaryDescriptor;

A single line of the dictionary file looks like this:

1             überglücklich <-> blissful                     (0) 

The curr_dict_ seems to import fine, but when outputting it I get a whole bunch of garbage characters at the end of the file!

I even used a hex editor to make sure that my file containing the dictionary didn't contain excess characters at the end. It didn't.

The registry file the top code is reading for the dictionary:

Christian.reg

Christian,abc123,10WS_PG2_P4_de_en_gefuehle.txt

What am I doing wrong?

Upvotes: 1

Views: 1249

Answers (2)

Loki Astari

Reputation: 264401

I would do this:

std::string strBuffer(length, '\0');
myread(file, &strBuffer[0], length); // guaranteed to read length bytes from file into the buffer

This avoids the need for an intermediate buffer completely.
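A minimal sketch of the same idea using only the standard std::istream::read, in case it helps. The helper names read_whole_file and field_after_last_comma are made up for illustration and are not part of the question's code. The sketch assumes the file is small enough to hold in memory, and it trims the string to the number of bytes actually read, so no terminator has to be written by hand.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: reads an entire file straight into a std::string.
// Returns an empty string if the file cannot be opened or is empty.
std::string read_whole_file(const std::string& path)
{
    std::ifstream file(path.c_str(), std::ios::binary | std::ios::ate);
    if (!file.is_open())
        return std::string();

    // Opened with std::ios::ate, so tellg() reports the file size.
    const std::streamoff end = file.tellg();
    if (end <= 0)
        return std::string();
    file.seekg(0, std::ios::beg);

    // The string owns the storage and knows its own length.
    std::string contents(static_cast<std::size_t>(end), '\0');
    file.read(&contents[0], static_cast<std::streamsize>(contents.size()));

    // Keep only the bytes that were actually read.
    contents.resize(static_cast<std::size_t>(file.gcount()));
    return contents;
}

// Hypothetical helper: returns the text after the last comma,
// or the whole input if there is no comma.
std::string field_after_last_comma(const std::string& text)
{
    const std::size_t comma = text.find_last_of(',');
    return comma == std::string::npos ? text : text.substr(comma + 1);
}
```

With the registry file from the question, field_after_last_comma(read_whole_file(file_name)) would yield the dictionary name, provided the file does not end with a trailing newline.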

Upvotes: 1

lijie

Reputation: 4871

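The read function (as in the line file.read(buffer, length);) does not NUL-terminate the character buffer, so constructing a std::string from buffer keeps reading past the end of the data until it happens to hit a zero byte. You need to terminate it yourself: allocate one more character, and after reading place the NUL at index file.gcount().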

Upvotes: 3
