Paramore

Reputation: 1323

C++ - read ANSI encoded text as wide character string

My aim is to read ANSI-encoded text. For various reasons I'm using the fgetws() function (not fgets()), and the file is opened in binary mode. Here is a short piece of code that demonstrates my problem:

  bool testfunc(wchar_t path[])
  {
     wchar_t buffer[10];

     if( FILE * fr=_wfopen(path,L"rb") )
     {
        fgetws(buffer,sizeof(buffer),fr);
        fclose(fr);
        return true;
     }
     else return false;
  }

When I call this function and pass the path of an ANSI-encoded text file as the argument, an Access Violation error occurs at run time. The error seems to occur only when the text is large enough. I cannot figure out where the problem is.

Upvotes: 0

Views: 1873

Answers (2)

If the file contains only ASCII characters, you need to convert each individual ASCII character to its wide character equivalent. (Remember that ASCII is a subset of Unicode, and that the size of wchar_t is implementation specific: it may only fit a fixed-width encoding of a subset of Unicode characters, so wchar_t is not very portable.)

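{
#define SIZE 80
    char cbuf[SIZE];      /* narrow line as read from the file */
    wchar_t wbuf[SIZE];   /* widened copy, same element count */
    char* pc;
    wchar_t* pw;
    memset (cbuf, 0, sizeof(cbuf));
    memset (wbuf, 0, sizeof(wbuf));   /* zero fill keeps wbuf terminated */
    /* fr is the already opened FILE*; skip the copy if nothing was read */
    if (fgets (cbuf, SIZE, fr) != NULL)
      for ((pc=cbuf), (pw=wbuf); pc<cbuf+SIZE && *pc != 0; pc++, pw++)
        *pw = (wchar_t) (unsigned char) *pc;   /* avoid sign extension */
}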
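The same per-character widening can be wrapped in a function that refuses input that is not plain ASCII. This is only a minimal sketch: the name widen_ascii and the choice of std::string / std::wstring are assumptions made for illustration, and bytes above 0x7F are rejected rather than translated, because mapping them correctly would require knowing the code page.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the answer above): widens a 7-bit ASCII
// string into a std::wstring, one character at a time.
// Returns false and leaves `out` empty if any byte is outside ASCII.
bool widen_ascii(const std::string& in, std::wstring& out)
{
    out.clear();
    out.reserve(in.size());
    for (char c : in) {
        // Go through unsigned char so a negative char is never sign-extended.
        const unsigned char byte = static_cast<unsigned char>(c);
        if (byte > 0x7F) {          // not ASCII: a code page lookup would be needed
            out.clear();
            return false;
        }
        out.push_back(static_cast<wchar_t>(byte));
    }
    assert(out.size() == in.size());   // one wide character per input byte
    return true;
}
```

A line read with fgets() could be passed to widen_ascii() directly; a false return signals that the file is not pure ASCII and needs a real code page conversion instead.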

P.S. Read the NOTES section of the fgetws(3) man page carefully. It can come across as scary.

Upvotes: 1

rodrigo

Reputation: 98526

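According to the documentation, the second parameter of fgetws is the number of wide characters in the array, not the number of bytes. sizeof(buffer) is the size in bytes (20 where wchar_t is 2 bytes, as on Windows), so fgetws may write past the end of the 10-element array. Use the element count instead: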

fgetws(buffer,sizeof(buffer)/sizeof(*buffer),fr);

A useful classic macro for this is:

#define countof(x) (sizeof(x)/sizeof(*(x)))

Or a fancy C++ template:

template <typename T, int N>
int countof(T (&a)[N])
{
    return N;
}
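Putting the pieces together, a corrected read function might look like the sketch below. The helper name read_first_line is hypothetical, and std::fopen with a narrow path is swapped in for the question's _wfopen only so that the sketch compiles outside Windows. It assumes the file content can be converted in the current locale (plain ASCII is enough for that).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cwchar>

// Element count of a built-in array, deduced at compile time.
// Returns int because fgetws takes its count as an int.
template <typename T, std::size_t N>
constexpr int countof(T (&)[N])
{
    return static_cast<int>(N);
}

// Hypothetical helper: reads the first line of `path` into `out`.
// `capacity` is the size of `out` in wide characters, not in bytes.
bool read_first_line(const char* path, wchar_t* out, int capacity)
{
    assert(out != nullptr && capacity > 0);

    std::FILE* fr = std::fopen(path, "rb");
    if (!fr)
        return false;

    // fgetws stores at most capacity - 1 wide characters plus a terminator,
    // so it cannot run past the end of the array.
    const bool ok = std::fgetws(out, capacity, fr) != nullptr;
    std::fclose(fr);
    return ok;
}
```

With wchar_t buffer[10], the call becomes read_first_line(path, buffer, countof(buffer)), and a line longer than the buffer is truncated to nine characters instead of overflowing it.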

Upvotes: 2
