unicon

Reputation: 1

How to check a text file's original encoding in VC++ or MFC

I use CStdioFile to read a text file into a string, but I want to check the file's original encoding when I choose it in the file dialog. How can I check the original encoding?

//This is my code

if (dlg.DoModal() == IDOK)
{
    path = dlg.GetPathName(); // get the file path
    CStdioFile pStdioFile1(path, CFile::modeRead);
    CString line; // receives one line per ReadString call
    bool first = true;

    // ReadString returns FALSE at end of file, so no feof() check is needed
    while (pStdioFile1.ReadString(line))
    {
        if (!first)
        {
            msg += "\n"; // separate lines without adding a trailing newline
        }
        msg += line;
        first = false;
    }
}

Upvotes: 0

Views: 1020

Answers (2)

Patrick

Reputation: 23629

Check the BOM (Byte Order Mark) of the file (see http://en.wikipedia.org/wiki/Byte_order_mark).

If the file does not contain a BOM, assume it's an 8-bit ANSI file.

Otherwise, the BOM indicates the format of the file. The linked page has a table of the different BOMs and their meanings.
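As a rough illustration of the BOM check, here is a minimal sketch in portable C++. The function name `detectBom` and the strings it returns are made up for this example; only the byte values come from the BOM table on the linked page. It looks at the first few bytes of the file and returns "none" when no known BOM is found, in which case you would fall back to treating the file as 8-bit ANSI.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of MFC or the Windows API): names the
// encoding implied by a leading BOM, or returns "none" if there is no
// known BOM at the start of the data.
std::string detectBom(const unsigned char* data, std::size_t size)
{
    // UTF-32 is tested before UTF-16 because the UTF-32 LE BOM
    // (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE). A UTF-16 LE
    // file whose first character is U+0000 would be misreported here;
    // that case is rare but shows that this is still a heuristic.
    if (size >= 4 && data[0] == 0xFF && data[1] == 0xFE &&
        data[2] == 0x00 && data[3] == 0x00)
        return "UTF-32 LE";
    if (size >= 4 && data[0] == 0x00 && data[1] == 0x00 &&
        data[2] == 0xFE && data[3] == 0xFF)
        return "UTF-32 BE";
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return "UTF-8";
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return "UTF-16 LE";
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return "UTF-16 BE";
    return "none";
}
```

In an MFC program you could open the file with CFile in binary mode, read the first four bytes with CFile::Read, and pass the buffer together with the number of bytes actually read to this function.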

Upvotes: 1

Jerry Coffin

Reputation: 490408

You can't. In some cases the data will contain indications of the encoding used, but you can't really depend on it. Windows does provide IsTextUnicode to give you a guess at whether some text is Unicode (in this case meaning UTF-16) or not, but 1) it's only good for Unicode, and 2) the result is only a guess anyway.

As an aside, I'm not excited about your code for reading the whole file into a string. Assuming the file is expected to be fairly small, I'd normally use something like:

#include <fstream>
#include <sstream>

std::ifstream in(dlg.GetPathName());
std::stringstream buffer;
buffer << in.rdbuf();

// now the content of the file is available as `buffer.str()`.

Upvotes: 0
