Graveyard

Reputation: 13

Reading a single character from a file returns special characters?

Using fstreams, I'm attempting to read single characters from a specified location in a file and append them to a string. For some reason, reading these characters returns special characters. I've tried numerous things, but the most curious thing I found while debugging was that changing the initial value of char temp; causes every character in the string to change to that value.

int Class::numbers(int number, string& buffer) {
    char temp;

    if (number < 0 || buffer.length() > size) {
        exit(0);
    }

    string fname = name + ".txt";
    int start = number * size;

    ifstream readin(fname.c_str());
    readin.open(fname.c_str(), ios::in);
    readin.seekg(start);

    for (int i = 0; i < size; ++i) {
        readin.get(temp);
        buffer += temp;
    }

    cout << buffer << endl;
    readin.close();
    return 0;
}

Here is an example screenshot of the special characters being printed: https://i.sstatic.net/eN9Yy.png

Could the issue be where I'm starting using seekg? It seems to start in the appropriate position. Another thing I've considered is that maybe I'm reading some invalid place into the stream and it's just giving me junk characters from memory.

Any thoughts?

WORKING SOLUTION:

int Class::numbers(int number, string& buffer) {
    char temp;

    if (number < 0 || buffer.length() > size) {
        exit(0);
    }

    string fname = name + ".txt";
    int start = number * size;

    // Open the file once via the constructor; a second open() on an already-open stream fails.
    ifstream readin(fname.c_str());
    readin.seekg(start);

    for (int i = 0; i < size; ++i) {
        readin.get(temp);
        buffer += temp;
    }

    cout << buffer << endl;
    readin.close();
    return 0;
}

Here is the working solution. My program already had this file open, so opening it a second time was likely what caused the problem. I will do some further testing on this in my own time.
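A minimal sketch of why a second open would produce this symptom, assuming a standard-conforming library: calling open() on an ifstream that is already open sets failbit, and every later get() then fails and leaves temp untouched. The helper names second_open_fails and read_after_double_open are hypothetical and exist only for this illustration.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: opens `path` via the constructor, calls open() again,
// and reports whether that second call left the stream in a failed state.
bool second_open_fails(const std::string& path) {
    std::ifstream in(path.c_str());       // first open (constructor)
    if (!in.is_open()) return false;      // file missing: nothing to demonstrate
    in.open(path.c_str(), std::ios::in);  // second open on an already-open stream
    return in.fail();
}

// Hypothetical helper: repeats the question's read loop on a stream that was
// opened twice, with `initial` as the starting value of temp.
std::string read_after_double_open(const std::string& path, int count, char initial) {
    std::ifstream in(path.c_str());
    in.open(path.c_str(), std::ios::in);  // sets failbit
    in.seekg(0);                          // has no effect while failbit is set
    std::string out;
    char temp = initial;
    for (int i = 0; i < count; ++i) {
        in.get(temp);                     // fails, so temp keeps its old value
        out += temp;
    }
    return out;
}
```

With a readable file, second_open_fails should return true, and read_after_double_open(path, 3, 'X') should return "XXX", which matches the observation that the whole string takes on the initial value of temp.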

Upvotes: 1

Views: 1494

Answers (1)

Cloud

Reputation: 19333

For characters with a numeric value greater than 127 (that is, outside the 7-bit ASCII range), the actual character rendered on screen depends on the code page of the system you are currently using.

What is likely happening is that you are not getting a single "character" as you think you are.

First, to debug this, use your existing code to just open and print out an entire text file. Is your program capable of doing this? If not, it's likely that the "text" file you are opening isn't ASCII but UTF-16 or some other encoding. That means when you read a "character" (most likely 8 bits), you're only reading half of a 16-bit "wide character", and the result is meaningless to you.

For example, the gedit application will automatically render "Hello World" on screen as I'd expect, regardless of character encoding. However, in a hex editor, a UTF-8 encoded file looks like:

UTF-8 raw text:

0000000: 4865 6c6c 6f20 776f 726c 642e 0a         Hello world..

While UTF-16 looks like:

0000000: fffe 4800 6500 6c00 6c00 6f00 2000 7700  ..H.e.l.l.o. .w.
0000010: 6f00 7200 6c00 6400 2e00 0a00            o.r.l.d.....

This is what your program sees. C and C++ narrow-character streams hand you raw bytes and do no encoding conversion by default. If you want to handle other encodings, it's up to your program to accommodate them, either manually or by using a third-party library.
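One quick way to check for the UTF-16 case shown in the hex dump is to look for a byte order mark at the start of the file. The following is only a sketch: detect_bom is a hypothetical helper, it ignores UTF-32 (whose little-endian mark also begins with FF FE), and a file without a mark can still be in any encoding.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: guesses the encoding from a leading byte order mark.
// Returns "no BOM" when the file cannot be opened or has no recognised mark.
std::string detect_bom(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    unsigned char b[3] = {0, 0, 0};
    in.read(reinterpret_cast<char*>(b), 3);
    std::streamsize n = in.gcount();  // number of bytes actually read

    if (n >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF) return "UTF-8 (with BOM)";
    if (n >= 2 && b[0] == 0xFF && b[1] == 0xFE) return "UTF-16LE";
    if (n >= 2 && b[0] == 0xFE && b[1] == 0xFF) return "UTF-16BE";
    return "no BOM";
}
```

For a file that starts with the bytes ff fe, as in the UTF-16 dump above, this should report "UTF-16LE".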

Also, you aren't checking whether a read succeeded or whether you've gone past the end of the file. When get() fails, temp keeps whatever value it had before, so you could just be appending garbage.
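A minimal sketch of a checked version of the read loop, which stops as soon as the stream reports failure instead of appending a stale temp. The function name read_chunk and its parameters are hypothetical, not part of the question's class.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: reads up to `count` characters starting at byte
// offset `start`, and returns fewer if the file ends or a read fails.
std::string read_chunk(const std::string& path, std::streamoff start, std::size_t count) {
    std::string out;
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) return out;              // could not open the file
    in.seekg(start, std::ios::beg);
    if (!in) return out;              // seek failed
    char temp;
    // get() returns the stream, which tests false once a read fails.
    while (out.size() < count && in.get(temp)) {
        out += temp;
    }
    return out;
}
```

For a file containing "Hello World", read_chunk(path, 6, 5) returns "World", and asking for more characters than remain returns only what is actually there.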

Using a simple text file just containing the string "Hello World", can your program do this:


Code Listing


// read a file into memory
#include <cstring>      // std::memset
#include <fstream>      // std::ifstream
#include <iostream>     // std::cout

int main () {
    std::ifstream is ("test.txt", std::ifstream::binary);
    if (is) {
        // get length of file:
        is.seekg (0, is.end);
        int length = static_cast<int>(is.tellg());
        is.seekg (0, is.beg);

        // allocate memory:
        char * buffer = new char [length];

        // read data as a block:
        is.read (buffer,length);
        // print only the bytes that were actually read:
        std::cout.write (buffer, is.gcount());
        std::cout << std::endl;

        // repeat at arbitrary locations:
        for (int i = 0; i < length; i++ )
        {
            std::memset(buffer, 0x00, length);
            is.seekg (i, is.beg);
            is.read(buffer, length-i);
            // print only the bytes that were actually read:
            std::cout.write (buffer, is.gcount());
            std::cout << std::endl;
        }

        is.close();
        delete[] buffer;
    }

    return 0;
}

Sample Output


Hello World
Hello World
ello World
llo World
lo World
o World
 World
World
orld
rld
ld
d

Upvotes: 1
