Reputation: 41
I'm going through a folder of files and editing their titles. I'm trying to remove a certain part of each title, but the bracket that separates it from the rest is not a standard ASCII character, so I can't figure out a way to remove it. Here is a sample title: 【Remove this portion】keep this portion. I've included the code I'm using. I store the title in a CString and use CString::Find() to locate the portion, but it is unable to locate that type of bracket.
// Handle and data block for the file search
HANDLE hfind;
WIN32_FIND_DATA data;
// Builds the search pattern for the files to match
CString FileFormat = FolderPath + Format;
CString NewTitle, PulledFile;
// Retrieves the first matching file
hfind = FindFirstFile(FileFormat, &data);
// Runs the loop only if the handle is valid
if (hfind != INVALID_HANDLE_VALUE)
{
    // Loops until it hits the end of the folder
    do {
        PulledFile = data.cFileName;
        // Strips everything up to and including the closing bracket
        if (PulledFile.Find(L'【') != -1)
        {
            while (PulledFile.Find(L'】') != -1)
            {
                PulledFile = PulledFile.Right(PulledFile.GetLength() - 1);
            }
        }
        // Trims the trailing characters (Format length plus 9) from the name
        NewTitle = PulledFile.Left(PulledFile.GetLength() - (Format.GetLength() + 9));
        // Adds the title to the vector if it is not empty
        // (sizeof(NewTitle) is the object size and is never zero)
        if (!NewTitle.IsEmpty())
        {
            v.push_back(NewTitle);
        }
    } while (FindNextFile(hfind, &data));
    // Releases the search handle
    FindClose(hfind);
}
Upvotes: 3
Views: 181
Reputation: 1747
The most likely issue is that your project isn't being compiled with the character set you expect. According to the CString documentation:
A CStringW object contains the wchar_t type and supports Unicode strings. A CStringA object contains the char type, and supports single-byte and multi-byte (MBCS) strings. A CString object supports either the char type or the wchar_t type, depending on whether the MBCS symbol or the UNICODE symbol is defined at compile time.
The actual underlying type depends on your compilation parameters. What is most likely happening is that it's trying to compare a Unicode string with your MBCS string literal value, so the search never finds a match.
If you want to fix this, you should decide whether you want to use Unicode or MBCS and update your compilation parameters accordingly, defining either MBCS or UNICODE.
If you use Unicode, you will have to change your string literal because it currently works for MBCS. You can either use the code point escape L'\u3010', which yields the correct character, or make sure your source file is saved in a Unicode encoding and use u'【'.
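As a rough illustration of why the escape is the safer choice, the sketch below checks at compile time that the escaped literals carry the expected code unit values, whatever encoding the source file is saved in. It uses std::wstring as a stand-in for a Unicode-build CString so that it compiles without MFC/ATL, and HasOpeningBracket is a hypothetical helper name, not part of any library.

```cpp
#include <string>

// The escapes produce these values regardless of the source file's encoding.
static_assert(L'\u3010' == 0x3010, "left bracket is U+3010");
static_assert(L'\u3011' == 0x3011, "right bracket is U+3011");
static_assert(u'\u3010' == 0x3010, "char16_t literal has the same value");

// Hypothetical helper: true if the name contains the opening bracket.
// Assumption: std::wstring stands in for a wide-character CString here.
bool HasOpeningBracket(const std::wstring& name)
{
    return name.find(L'\u3010') != std::wstring::npos;
}
```

With a wide-character CString, the same test would be PulledFile.Find(L'\u3010') != -1.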
Upvotes: 2
Reputation: 104569
Most likely your editor isn't encoding the hardcoded 【 and 】 as the Unicode characters you're looking for. Visual Studio sometimes gets this right by automatically saving the source file as UTF-8, but that's not always reliable and may not survive a source control system that expects ASCII.
The easiest fix is to use the \uNNNN escape syntax to match the characters.
if (PulledFile.Find(L'\u3010') != -1)
{
    while (PulledFile.Find(L'\u3011') != -1)
    {
        PulledFile = PulledFile.Right(PulledFile.GetLength() - 1);
    }
}
Here \u3010 and \u3011 are the hex escape sequences for the Unicode values of 【 and 】 respectively.
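The loop above drops one character per pass until no closing bracket remains, which amounts to cutting at the last 】. A minimal sketch of the same result with a single search follows. It is not the original poster's code: std::wstring stands in for CString so the example compiles without MFC/ATL, and StripBracketPrefix is a hypothetical helper name.

```cpp
#include <string>

// Hypothetical helper: if the name contains the opening bracket (U+3010),
// drops everything up to and including the last closing bracket (U+3011).
// Names without the brackets are returned unchanged.
std::wstring StripBracketPrefix(const std::wstring& name)
{
    if (name.find(L'\u3010') == std::wstring::npos)
        return name;  // no opening bracket, nothing to strip

    const std::wstring::size_type close = name.rfind(L'\u3011');
    if (close == std::wstring::npos)
        return name;  // no closing bracket, leave the name alone

    return name.substr(close + 1);  // keep only what follows the bracket
}
```

With CString, the corresponding calls would be ReverseFind(L'\u3011') to locate the bracket and Mid(index + 1) to keep the remainder.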
Upvotes: 2