Reputation: 127
For the sake of learning, I'm trying to implement my own simple Tokenize function with CStrings. I currently have this file:
11111
22222
(ENDWPT)
222222
333333
(ENDWPT)
6060606
ggggggg
hhhhhhh
(ENDWPT)
iiiiiii
jjjjjjj
kkkkkkk
lllllll
mmmmmmm
nnnnnnn
I would like to tokenize it using the delimiter (ENDWPT). I wrote the following function, which finds the delimiter position, adds the delimiter length, and extracts the text up to that position. It then updates a counter so that the next call starts searching from the previous index. The function looks like this:
bool MyTokenize(CString strText, CString& strOut, int& iCount)
{
CString strDelimiter = L"(ENDWPT)";
int iIndex = strText.Find(strDelimiter, iCount);
if (iIndex != -1)
{
iIndex += strDelimiter.GetLength();
strOut = strText.Mid(iCount, iIndex);
iCount = iIndex;
return true;
}
return false;
}
It is called like so:
int nCount = 0;
while ((MyTokenize(strText, strToken, nCount)) == true)
{
// Handle tokenized strings here
}
Right now the function splits the strings in the wrong places. I think it is because Find() may be returning the wrong index: I expected it to return 12, but it actually returns 14.
I've run out of ideas; if anyone can figure this out I would really appreciate it.
Upvotes: 2
Views: 918
Reputation: 31629
If the delimiter is found (at iIndex), read iIndex - iCount characters starting from iCount, then advance iCount past the delimiter:
if(iIndex != -1)
{
strOut = strText.Mid(iCount, iIndex - iCount);
iCount = iIndex + strDelimiter.GetLength();
return true;
}
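To see why the original Mid(iCount, iIndex) call goes wrong from the second token onward, here is a small sketch that uses std::string::substr as a stand-in for CString::Mid, since both take a start position and a length. The helper names midByCount and midByEndIndex are made up for illustration and are not part of MFC.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Correct: the second argument of substr (like CString::Mid) is a length,
// so the token between `first` and `end` has length end - first.
std::string midByCount(const std::string& text, std::size_t first, std::size_t end)
{
    assert(first <= end && end <= text.size());
    return text.substr(first, end - first);
}

// What the question's code effectively did: it passed the end index as the
// length. This only happens to work while first == 0.
std::string midByEndIndex(const std::string& text, std::size_t first, std::size_t end)
{
    assert(first <= text.size());
    return text.substr(first, end);  // over-reads whenever first > 0
}
```

With the text "aa(X)bbb(X)cc", first = 5 and end = 8, midByCount returns "bbb" while midByEndIndex returns "bbb(X)cc", because it asks for 8 characters instead of 3.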
The source string may not end with the delimiter, so the text after the last delimiter needs a special case.
You can also pick names that match the parameters of CString::Mid(int nFirst, int nCount) to make the code easier to follow. MFC uses camelCase with Hungarian-style type prefixes on variable names, which is unnecessary in C++, so I'll avoid it in this example:
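bool MyTokenize(const CString& source, CString& token, int& first)
{
    CString delimiter = L"(ENDWPT)";
    // Search for the next delimiter, starting where the last call left off
    int end = source.Find(delimiter, first);
    if (end != -1)
    {
        // Mid takes a start index and a character count, not an end index
        int count = end - first;
        token = source.Mid(first, count);
        // Skip past the delimiter for the next call
        first = end + delimiter.GetLength();
        return true;
    }
    else
    {
        // No more delimiters: return whatever text is left, if any
        int count = source.GetLength() - first;
        if (count <= 0)
            return false;
        token = source.Mid(first, count);
        first = source.GetLength();
        return true;
    }
}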
...
int first = 0;
CString source = ...
CString token;
while(MyTokenize(source, token, first))
{
// Handle tokenized strings here
}
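If you want to try the same logic without MFC, here is a rough portable sketch that swaps CString for std::string (find and substr stand in for Find and Mid). The names nextToken and splitAll are hypothetical, and the CRLF line endings in the usage note are an assumption about the input file, since the question does not say how it was read.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Portable counterpart of MyTokenize. Precondition: delimiter is not empty,
// otherwise `first` would never advance.
bool nextToken(const std::string& source, const std::string& delimiter,
               std::string& token, std::size_t& first)
{
    assert(!delimiter.empty());
    const std::size_t end = source.find(delimiter, first);
    if (end != std::string::npos)
    {
        token = source.substr(first, end - first);  // length, not end index
        first = end + delimiter.size();             // skip the delimiter
        return true;
    }
    if (first >= source.size())
        return false;                               // nothing left
    token = source.substr(first);                   // trailing text, no delimiter
    first = source.size();
    return true;
}

// Convenience wrapper that collects every token into a vector.
std::vector<std::string> splitAll(const std::string& source,
                                  const std::string& delimiter)
{
    std::vector<std::string> tokens;
    std::string token;
    std::size_t first = 0;
    while (nextToken(source, delimiter, token, first))
        tokens.push_back(token);
    return tokens;
}
```

Calling splitAll on "11111\r\n22222\r\n(ENDWPT)\r\ntail" with the delimiter "(ENDWPT)" gives two tokens, "11111\r\n22222\r\n" and "\r\ntail". It also hints at the 12 versus 14 puzzle: with CRLF line endings the first delimiter sits at index 14 (5 + 2 + 5 + 2), whereas with bare "\n" it would be at 12, so Find() was most likely returning the right index all along. Note that the tokens keep their line breaks, so you may want to trim them.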
Upvotes: 2