Reputation: 405
The current specification is:
Given string data in the form of wide or narrow character arrays, write the functionality for a class that provides statistics on the data and modifies the data.
The requirement is for it to be maintainable over the long term.
So my first approach is to require that the raw char arrays be marshalled into strings beforehand, and then just provide a template class:
template<class T>
class MyString
{
private:
T _data;
public:
MyString(const T& input) : _data(input) {}
size_t doSomeWork() { /* assume T is of type basic_string<...> and use iterators */ return 0; /* placeholder result */ }
};
//Use
const char* data = "zyx";
string blahblah(data);
MyString<string> abc(blahblah);
abc.doSomeWork();
or static member functions:
class StringTools
{
public:
static size_t doWork(const char*) {}
static size_t doWork(const wchar_t*) {}
};
//used like so:
const char* data = "hallo kitty";
cout << StringTools::doWork(data);
or use a strategy pattern:
class MyString
{
protected:
MyString();
public:
virtual ~MyString();
virtual size_t doWork() = 0;
};
class MyStringCharArray : public MyString
{
protected:
const char* _data;
public:
MyStringCharArray(const char* input) : MyString(), _data(input) { }
virtual size_t doWork() {...};
};
//so it would be used like so
const char* blah = "blah";
MyString* str = new MyStringCharArray(blah);
cout << str->doWork();
delete str;
Then in the future, if for some godforsaken reason I switch to BSTRs, only the first two lines of the usage code would need to change, in addition to a new derived class being written.
I think that if I write a wrapper class as in 1 & 3, it becomes a lot more heavy-duty, and any encapsulation is broken because I'd have to allow access to the underlying data.
But if I create a class with only static functions, then all it does is mimic a namespace, which would be better served by some non-member, non-friend functions grouped under a "stringtools" namespace. In that case, though, I'd still be propagating the messiness of raw character arrays throughout the application, extra validation would have to be performed, and the specification asked explicitly for a class.
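To make the namespace option concrete, here is a rough sketch of what I mean. The statistic (countSpaces) and all names are made up for illustration; the only point is that the raw arrays are marshalled into strings at the boundary, so the validation lives in one place:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

namespace stringtools {

// Hypothetical statistic, written once for any character width.
template <class CharT>
std::size_t countSpaces(const std::basic_string<CharT>& text)
{
    return static_cast<std::size_t>(
        std::count(text.begin(), text.end(), CharT(' ')));
}

// Raw-array entry points: validate the pointer, marshal into a string,
// then forward to the template above.
inline std::size_t countSpaces(const char* raw)
{
    return raw ? countSpaces(std::string(raw)) : 0;
}

inline std::size_t countSpaces(const wchar_t* raw)
{
    return raw ? countSpaces(std::wstring(raw)) : 0;
}

} // namespace stringtools
```

It would be called as stringtools::countSpaces("hallo kitty") or stringtools::countSpaces(L"hallo kitty"), which still does not satisfy the "must be a class" part of the specification.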
So what would be the cleanest and most maintainable approach to take?
rgds
Upvotes: 4
Views: 303
Reputation: 405
OK, so I've taken on board what you've all said and I've split the algorithms from the specification. Mimicking the STL, the algorithms work with iterators, whilst the "MyClass" that the specification asks for encapsulates the domain knowledge.
How does this look to you guys?
first off the algorithm:
/*
MyLib
---------------
Methods:
countOccurances()
- takes two iterator pairs, each delimiting a contiguous area of memory or a container
- counts the number of times the second range (needle) occurs in the first (haystack)
- returns the count
replaceOccurances()
- same as countOccurances, except that each matched sequence is overwritten
by the replacement needle, which must be the same length
*/
template<class fwdIt>
size_t countOccurances(fwdIt haystackFront, fwdIt haystackEnd,
                       fwdIt needleFront, fwdIt needleEnd)
{
    size_t count = 0;
    //an empty needle would match at every position, so report no matches
    if(needleFront == needleEnd)
        return count;
    while(haystackFront != haystackEnd)
    {
        //try to match the needle at the current position,
        //never reading past the end of the haystack
        fwdIt tempIT1 = haystackFront, tempIT2 = needleFront;
        while(tempIT1 != haystackEnd && tempIT2 != needleEnd && *tempIT1 == *tempIT2)
        {
            ++tempIT1; ++tempIT2;
        }
        if(tempIT2 == needleEnd)
        {
            //full match: count it and resume right after it (non-overlapping matches)
            haystackFront = tempIT1;
            ++count;
        }
        else
        {
            ++haystackFront;
        }
    }
    return count;
}
//needs <algorithm>, <iterator> and <stdexcept>
template<class fwdIt>
size_t replaceOccurances(fwdIt haystackFront, fwdIt haystackEnd,
                         fwdIt needleFront, fwdIt needleEnd,
                         fwdIt replacementFront, fwdIt replacementEnd)
{
    //The needle and its replacement must be the same length,
    //this method cannot be responsible for growing a container it doesn't own.
    if(std::distance(needleFront, needleEnd) != std::distance(replacementFront, replacementEnd))
        throw std::invalid_argument("The needle and its replacement are not the same length");
    size_t count = 0;
    //an empty needle would match at every position, so report no matches
    if(needleFront == needleEnd)
        return count;
    while(haystackFront != haystackEnd)
    {
        //try to match the needle at the current position,
        //never reading past the end of the haystack
        fwdIt tempIT1 = haystackFront, tempIT2 = needleFront;
        while(tempIT1 != haystackEnd && tempIT2 != needleEnd && *tempIT1 == *tempIT2)
        {
            ++tempIT1; ++tempIT2;
        }
        if(tempIT2 == needleEnd)
        {
            //overwrite the match in place; std::copy returns the position just past it
            haystackFront = std::copy(replacementFront, replacementEnd, haystackFront);
            ++count;
        }
        else
        {
            ++haystackFront;
        }
    }
    return count;
}
and now MyClass
class MyClass
{
public:
static size_t getMyCount(std::string& sInput);
static size_t getMyCount(std::wstring& sInput);
static size_t replaceMyWithMY(std::string& sInput);
static size_t replaceMyWithMY(std::wstring& sInput);
protected:
static std::string _narrowNeedle;
static std::wstring _wideNeedle;
static std::string _narrowReplacementNeedle;
static std::wstring _wideReplacementNeedle;
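//Two-argument form used for counting. A non-const reference parameter cannot
//default to a temporary T(), so the needle doubles as an unused placeholder here.
template<class T>
static size_t _PerformStringOperation(T& sInput, T& sNeedle)
{
return _PerformStringOperation(sInput, sNeedle, false, sNeedle);
}
template<class T>
static size_t _PerformStringOperation(T& sInput, T& sNeedle, bool replace, T& sReplacementNeedle)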
{
try
{
if(replace)
{
return replaceOccurances( sInput.begin(), sInput.end(),
sNeedle.begin(), sNeedle.end(),
sReplacementNeedle.begin(), sReplacementNeedle.end());
}
else
{
return countOccurances( sInput.begin(), sInput.end(),
sNeedle.begin(), sNeedle.end());
}
}
catch(MYException& e)
{
clog << "MyClass::_PerformStringOperation() - could not perform operation" << endl;
clog << e.what();
throw;
}
catch(exception& e)
{
clog << "MyClass::_PerformStringOperation() - Something more fundamental went wrong" << endl;
clog << e.what();
throw;
}
}
};
and the accompanying CPP
std::string MyClass::_narrowNeedle("My");
std::wstring MyClass::_wideNeedle = std::wstring(L"My");
std::string MyClass::_narrowReplacementNeedle = std::string("MY");
std::wstring MyClass::_wideReplacementNeedle = std::wstring(L"MY");
//The definitions must use the names declared in the class (getMyCount / replaceMyWithMY).
//Exceptions propagate to the caller on their own, so a try/catch that only rethrows is not needed.
size_t MyClass::getMyCount(std::string& sInput)
{
return _PerformStringOperation(sInput,_narrowNeedle);
}
size_t MyClass::getMyCount(std::wstring& sInput)
{
return _PerformStringOperation(sInput,_wideNeedle);
}
size_t MyClass::replaceMyWithMY(std::string& sInput)
{
return _PerformStringOperation(sInput,_narrowNeedle,true,_narrowReplacementNeedle);
}
size_t MyClass::replaceMyWithMY(std::wstring& sInput)
{
return _PerformStringOperation(sInput,_wideNeedle,true,_wideReplacementNeedle);
}
Upvotes: 0
Reputation: 24160
The problem is underspecified, and you're overdesigning this.
What kind of statistics will be gathered? What kinds of changes will be made to the strings? How long will the strings be? Does performance matter?
Why not go with a simple solution: Write all of your statistics / string changing routines to work on UTF-8 (assuming this is the encoding for your chars). If you need to work on a UTF-16 string, convert it to UTF-8, call the routine that works on it, then convert the modified string back to UTF-16. KISS.
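As a rough illustration of that round trip, here is a minimal sketch. Everything in it is an assumption made for the example: the hand-rolled converters assume well-formed input and do no validation, UTF-16 is held in std::u16string (C++11), and spacesToUnderscores is a made-up modifying routine. In real code you would more likely use a conversion library you trust.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append one Unicode code point to a UTF-8 byte string.
void appendUtf8(std::string& out, char32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// UTF-16 -> UTF-8. Assumes well-formed input (surrogates always paired).
std::string toUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            char32_t low = in[++i];  // combine the surrogate pair
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        }
        appendUtf8(out, cp);
    }
    return out;
}

// UTF-8 -> UTF-16. Assumes well-formed input; no validation is done.
std::u16string toUtf16(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        int extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        char32_t cp = (extra == 0) ? lead : (lead & (0x3F >> extra));
        for (int k = 1; k <= extra && i + k < in.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        i += extra + 1;
        if (cp < 0x10000) {
            out += static_cast<char16_t>(cp);
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}

// Hypothetical modifying routine, written once against UTF-8. Touching only
// ASCII bytes is safe: bytes below 0x80 never occur inside a multi-byte sequence.
void spacesToUnderscores(std::string& utf8)
{
    for (char& c : utf8)
        if (c == ' ') c = '_';
}

// UTF-16 entry point: convert, call the UTF-8 routine, convert back.
void spacesToUnderscores(std::u16string& text)
{
    std::string utf8 = toUtf8(text);
    spacesToUnderscores(utf8);
    text = toUtf16(utf8);
}
```

The cost is two extra passes over the string per call, which only matters if the strings are long or the routine is called very often.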
Edit:
There is another consideration: Is your algorithm encoding-agnostic, i.e. independent of the encoding of the strings, so that the only variable is the "wideness" of the characters? If this is the case, a templatized routine that takes a beginning and end iterator, as parapura suggests, may well be an option.
Based on your clarification, it sounds as if your algorithm currently is encoding-agnostic, but since you mention maintainability, you should consider whether this will also be true in the future.
Upvotes: 0
Reputation: 490338
It sounds to me like instead of a class, you should be thinking in terms of generic algorithms that operate on perfectly normal string/wstring data. Depending on what you end up needing in the way of statistics/modification, you might not even need to write actual algorithms, but just functors to be applied with (for example) std::accumulate.
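A minimal sketch of the functor idea, assuming the statistic is something as simple as "how many times does this character occur" (CountChar and countChar are made-up names, not anything from the question):

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>

// Functor for std::accumulate: adds 1 to the running total for each
// character equal to the target.
template <class CharT>
struct CountChar {
    CharT target;
    std::size_t operator()(std::size_t runningTotal, CharT c) const
    {
        return runningTotal + (c == target ? 1 : 0);
    }
};

// Works for std::string and std::wstring alike; no wrapper class needed.
template <class String>
std::size_t countChar(const String& s, typename String::value_type target)
{
    CountChar<typename String::value_type> op = { target };
    return std::accumulate(s.begin(), s.end(), std::size_t(0), op);
}
```

For this particular statistic std::count would do the same job; the functor form is just the shape that extends to statistics the standard library doesn't already provide.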
If you end up needing to work with something like a BSTR, you'll need to provide an iterator interface to a BSTR. Personally, however, even when I'm dealing with COM, I tend to actually work with normal strings nearly all the time, and convert to a BSTR only immediately before passing it to some COM thing. Likewise on the return trip, as soon as I receive a BSTR, I convert it to a normal string, and work with it in that form. In theory it might be faster to leave things as BSTRs if you're working with COM quite a bit, but I've yet to see conversion to/from normal strings turn into anything approaching a bottleneck.
Upvotes: 1
Reputation: 24413
The best approach would be to do something like what the STL algorithms do.
Have a processing algorithm that takes only begin and end iterators over the char/wchar_t data. This way your algorithm will work seamlessly for all strings that are contiguous in memory.
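Something along these lines, as a sketch only. The statistic (length of the longest run of equal characters) and the name longestRun are invented for the example; the point is that the same template accepts raw char arrays, std::string and std::wstring:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Returns the length of the longest run of equal adjacent elements
// in [first, last). Only forward iteration is required.
template <class FwdIt>
std::size_t longestRun(FwdIt first, FwdIt last)
{
    std::size_t best = 0, current = 0;
    FwdIt prev = first;
    for (FwdIt it = first; it != last; ++it) {
        // extend the current run, or start a new one of length 1
        current = (it != first && *it == *prev) ? current + 1 : 1;
        if (current > best)
            best = current;
        prev = it;
    }
    return best;
}
```

For a raw array the call is longestRun(raw, raw + std::strlen(raw)); for a string it is longestRun(s.begin(), s.end()).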
Upvotes: 4