Marwie

Reputation: 3327

Getting exception when writing UTF-8 BOM

I have to manually add a UTF-8 BOM to a simple text file, but writing it with the code below fails. With my rather limited C++ knowledge, I don't understand what I am doing wrong. I assume it is related to the fact that I write only 3 bytes while the system expects a multiple of 2, for whatever reason. The code is compiled with the Unicode character set. Any hints pointing me in the right direction would be welcome.

FILE* fStream;
errno_t e = _tfopen_s(&fStream, strExportFile, TEXT("wt,ccs=UTF-8"));   //UTF-8

if (e != 0) 
{
    //Error Handling
    return 0;
}

CStdioFile* fileo = new CStdioFile(fStream);
fileo->SeekToBegin();

//Write BOM
unsigned char bom[] = { 0xEF,0xBB,0xBF };
fileo->Write(bom,3);
fileo->Flush();  //BOOM: Assertion failed buffer_size % 2 == 0

Upvotes: 2

Views: 397

Answers (1)

Mark Ransom

Reputation: 308530

According to Microsoft's documentation for _tfopen_s (emphasis added):

When a Unicode stream-I/O function operates in text mode (the default), the source or destination stream is assumed to be a sequence of multibyte characters. Therefore, the Unicode stream-input functions convert multibyte characters to wide characters (as if by a call to the mbtowc function). For the same reason, the Unicode stream-output functions convert wide characters to multibyte characters (as if by a call to the wctomb function).

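Because the file was opened with ccs=UTF-8, you are expected to write UTF-16 (wide) characters to it, which the runtime then translates to UTF-8. That is why the buffer size has to be a multiple of 2. Instead of the 3-byte sequence 0xEF, 0xBB, 0xBF, you need to write the single 16-bit value 0xFEFF.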
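
To see why the single value 0xFEFF is enough, here is a minimal, portable sketch of the UTF-16 to UTF-8 translation for one character. The helper names encode_utf8_bmp and check_bom_encoding are hypothetical and only illustrate the encoding rule; this is not the CRT's actual implementation, and it assumes a code point in the Basic Multilingual Plane with no surrogate handling.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a CRT function): encodes one BMP code point
// (U+0000..U+FFFF, surrogate pairs not handled) as UTF-8 bytes.
std::vector<std::uint8_t> encode_utf8_bmp(char16_t cp)
{
    std::vector<std::uint8_t> out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out.push_back(static_cast<std::uint8_t>(cp));
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out.push_back(static_cast<std::uint8_t>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    } else {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<std::uint8_t>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    }
    return out;
}

// Sanity check: U+FEFF becomes the familiar three-byte UTF-8 BOM.
void check_bom_encoding()
{
    const std::vector<std::uint8_t> expected = { 0xEF, 0xBB, 0xBF };
    assert(encode_utf8_bmp(0xFEFF) == expected);
}
```

In the question's code, the same translation happens inside the stream, so writing one 16-bit 0xFEFF should end up as the bytes EF BB BF in the file.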

Upvotes: 3
