How to read a character not included in ASCII in C++?

Question

I'm going through a folder of files and editing their titles. I'm trying to remove a certain piece of each title, but the bracket that separates it is not a standard ASCII character, so I can't figure out a way to remove it. This is a sample title: 【Remove this portion】keep this portion. I've included the code I'm using. I store the title in a CString and use CString::Find() to locate the portion, but it is unable to locate that type of bracket.

    // handle and data block used by the file search
    HANDLE hfind;
    WIN32_FIND_DATA data;

    // search pattern: folder path plus the file format wildcard
    CString FileFormat = FolderPath + Format;
    CString NewTitle, PulledFile;

    // retrieve the first matching file
    hfind = FindFirstFile(FileFormat, &data);

    // run the loop only if the handle is valid
    if (hfind != INVALID_HANDLE_VALUE)
    {
        // loop until the end of the folder is reached
        do {
            PulledFile = data.cFileName;

            // drop everything up to and including the closing bracket
            int closePos = PulledFile.Find(L'】');
            if (PulledFile.Find(L'【') != -1 && closePos != -1)
            {
                PulledFile = PulledFile.Mid(closePos + 1);
            }

            // trim the trailing Format.GetLength() + 9 characters
            NewTitle = PulledFile.Left(PulledFile.GetLength() - (Format.GetLength() + 9));

            // add non-empty titles to the vector (sizeof() is never zero, so test the content)
            if (!NewTitle.IsEmpty())
            {
                v.push_back(NewTitle);
            }
        } while (FindNextFile(hfind, &data));

        // release the search handle
        FindClose(hfind);
    }

meneldal · Accepted Answer

The most likely issue is that your project is not being compiled with the character set you expect. According to the CString documentation:

A CStringW object contains the wchar_t type and supports Unicode strings. A CStringA object contains the char type, and supports single-byte and multi-byte (MBCS) strings. A CString object supports either the char type or the wchar_t type, depending on whether the MBCS symbol or the UNICODE symbol is defined at compile time.

The actual underlying type depends on your compilation parameters. What is most likely happening is that a Unicode string is being compared with your character literal interpreted as MBCS, so Find() never reports a match.
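To illustrate why the two representations don't line up, here is a small standard C++ sketch (not MFC; the helper names are made up for this example). It shows the opening bracket as a single wide code unit and as a multi-byte narrow sequence. UTF-8 is assumed for the narrow side; an MBCS code page would produce a different byte sequence.

```cpp
#include <string>

// U+3010 (the opening bracket) as a wide string: a single wchar_t code unit.
inline std::wstring wideBracket()
{
    return L"\u3010";
}

// The same character spelled out as UTF-8 bytes (assumed narrow encoding).
// No single char in this string equals the wide value 0x3010.
inline std::string narrowBracketUtf8()
{
    return "\xE3\x80\x90";
}
```

A search for one wide character can therefore only succeed if the string being searched also stores the bracket as a wide character.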

If you want to fix this, decide whether you want to use Unicode or MBCS and update your compilation parameters accordingly, defining either the MBCS or the UNICODE symbol (in Visual C++ projects these are _MBCS, or UNICODE together with _UNICODE).

If you use Unicode, you may have to change your character literal, because as written it may be interpreted using the MBCS code page. You can either use the universal character name L'\u3010', which yields the correct character regardless of how the source file is saved, or make sure your source file uses a Unicode encoding and write u'【'.
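As a rough sketch of the escaped-literal approach, the function below removes a "【...】" block using L'\u3010' and L'\u3011'. It uses std::wstring as a portable stand-in for CStringW, and the name stripBracketedPrefix is hypothetical. Unlike the loop in the question, it keeps any text that precedes the opening bracket.

```cpp
#include <string>

// Hypothetical helper: removes the first "【...】" block from a title.
// std::wstring stands in for CStringW so the sketch builds without MFC.
std::wstring stripBracketedPrefix(const std::wstring& title)
{
    const wchar_t open  = L'\u3010'; // opening bracket, written as an escape
    const wchar_t close = L'\u3011'; // closing bracket, written as an escape

    const std::wstring::size_type start = title.find(open);
    if (start == std::wstring::npos)
        return title; // no opening bracket: leave the title unchanged

    const std::wstring::size_type end = title.find(close, start + 1);
    if (end == std::wstring::npos)
        return title; // unmatched opening bracket: leave the title unchanged

    // Keep the text before the opening bracket and after the closing one.
    return title.substr(0, start) + title.substr(end + 1);
}
```

With a CStringW, the same idea can be expressed with Find() to locate the closing bracket and Mid() to keep the remainder.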
