HoverCatz

Reputation: 75

Reading file made by cmd, results in 3 weird symbols

I'm using this piece of code to read a file into a string, and it works perfectly with files created manually in Notepad, Notepad++ or other text editors:

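#include <fstream>
#include <iterator>
#include <string>

// Reads the whole file into a string without any encoding conversion.
std::string utils::readFile(std::string file)
{
    std::ifstream t(file);
    std::string str((std::istreambuf_iterator<char>(t)),
                      std::istreambuf_iterator<char>());
    return str;
}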

When I create a file in Notepad (or any other editor) and save it, I get this result in my program:
[image: program output]

But when I create a file via CMD (example command below), and run my program, I receive an unexpected result:
[image]

cmd /C "hostname">"C:\Users\Admin\Desktop\lel.txt" & exit

Result:
[image: program output]

When I open this file generated by CMD (lel.txt), this is the file contents:
[image: file contents]

If I edit the generated file (lel.txt) in Notepad (adding a space to the end of the file) and run my program again, I get the same weird three-character result.

What might cause this? How can I correctly read a file created via cmd?

EDIT
I changed my command (now using PowerShell) and added a function I found, named SkipBOM, and now it works:

powershell -command "hostname | Out-File "C:\Users\Admin\Desktop\lel.txt" -encoding "UTF8""

SkipBOM:

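void SkipBOM(std::ifstream &in)
{
    // Read the first three bytes and compare them with the UTF-8 BOM (EF BB BF).
    char test[3] = { 0 };
    in.read(test, 3);
    if (in.gcount() == 3 &&
        (unsigned char)test[0] == 0xEF &&
        (unsigned char)test[1] == 0xBB &&
        (unsigned char)test[2] == 0xBF)
    {
        return; // BOM found: the stream is now positioned just after it
    }
    // No BOM: clear any eof/fail flags left by a short read, then rewind.
    in.clear();
    in.seekg(0);
}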
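
For reference, here is a minimal sketch of a variant that does the same job in one step. It is not the SkipBOM function above: it reads the whole file first and then drops a leading UTF-8 BOM from the string. The name readFileWithoutBom is made up for this example, and opening the file in binary mode is my own assumption, not something the original readFile does.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: reads a whole file and drops a leading UTF-8 BOM, if any.
std::string readFileWithoutBom(const std::string &path)
{
    // Assumption: binary mode, so the bytes arrive exactly as stored on disk.
    std::ifstream in(path, std::ios::binary);
    std::string str((std::istreambuf_iterator<char>(in)),
                    std::istreambuf_iterator<char>());

    // The UTF-8 BOM is the byte sequence EF BB BF.
    if (str.compare(0, 3, "\xEF\xBB\xBF") == 0)
    {
        str.erase(0, 3);
    }
    return str;
}
```

A file without a BOM (or one shorter than three bytes) comes back unchanged, because the comparison simply fails.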

Upvotes: 0

Views: 175

Answers (2)

user4958316

Reputation:

That is how Unicode looks when it is treated as an ANSI string. In Notepad, use File - Save As to see what the current encoding of a file is.

CMD uses an OEM code page, which is the same as ANSI for plain English (ASCII) characters, so CMD converts any Unicode output to OEM. Perhaps you are grabbing the data yourself.

In VB you would use StrConv to convert it.
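
To illustrate the first point in C++ (the language of the question), here is a rough sketch of what "Unicode treated as ANSI" does to the bytes. It assumes the three symbols are the UTF-8 BOM bytes EF BB BF; the question does not confirm this. The helper name latin1BytesToUtf8 is made up for the example. It treats every byte as one Latin-1 character and re-encodes it as UTF-8 so the result can be printed on a UTF-8 terminal.

```cpp
#include <string>

// Hypothetical helper: interprets every byte as one Latin-1 character (Latin-1
// matches Windows-1252 for bytes 0xA0-0xFF) and re-encodes the text as UTF-8.
std::string latin1BytesToUtf8(const std::string &bytes)
{
    std::string out;
    for (unsigned char b : bytes)
    {
        if (b < 0x80)
        {
            out += static_cast<char>(b);                 // ASCII stays one byte
        }
        else
        {
            out += static_cast<char>(0xC0 | (b >> 6));   // UTF-8 lead byte
            out += static_cast<char>(0x80 | (b & 0x3F)); // continuation byte
        }
    }
    return out;
}
```

Passing the three BOM bytes to this function yields the characters ï»¿, which is the usual way a UTF-8 BOM shows up when it is displayed as ANSI text.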

Upvotes: 1

sirgeorge

Reputation: 6541

This is almost certainly a BOM (Byte Order Mark), which means that your file is saved in Unicode with a BOM. There is a way to use C++ streams to read files with a BOM (you have to use converters); let me know if you need help with that.
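
Before picking a converter, it can help to check which BOM the file actually starts with. The following is only a sketch of that check, not the converter approach mentioned above. The Bom enum and the detectBom name are made up for this example, and UTF-32 BOMs are deliberately left out.

```cpp
#include <string>

// Hypothetical result type for this example.
enum class Bom { None, Utf8, Utf16LE, Utf16BE };

// Hypothetical helper: looks at the first bytes of a buffer for a known BOM.
// Note: UTF-32 is not handled; a UTF-32 LE file would be reported as Utf16LE.
Bom detectBom(const std::string &data)
{
    if (data.compare(0, 3, "\xEF\xBB\xBF") == 0)
    {
        return Bom::Utf8;    // EF BB BF
    }
    if (data.compare(0, 2, "\xFF\xFE") == 0)
    {
        return Bom::Utf16LE; // FF FE
    }
    if (data.compare(0, 2, "\xFE\xFF") == 0)
    {
        return Bom::Utf16BE; // FE FF
    }
    return Bom::None;
}
```

The input is expected to be the raw file contents, for example a string filled by reading the file in binary mode. A result of Bom::Utf8 means dropping the first three bytes is enough for a char-based reader; the UTF-16 cases need a real conversion.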

Upvotes: 2
