Reputation: 1509
Can anyone suggest a way of stripping tab characters ("\t") from a std::string?
I know that I can do a lot with:
str.erase(std::remove(str.begin(), str.end(), ' '), str.end());
But that removes every space.
For example I want this :
push int32(45)
or __WT__ push int32(45) __WT__
To become this :
push int32(45)
A string with only one space between keywords.
__WT__ = useless whitespace or tabs.
Thanks in anticipation.
Upvotes: 0
Views: 2995
Reputation: 20025
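You can create a template trim function, implemented in a similar way to remove_if:
#include <cctype>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
using namespace std;

// Copies [first, last) to result, dropping leading and trailing junk and
// collapsing each inner run of junk to the last character of that run.
template <class ForwardIterator, class OutputIterator, class UnaryPredicate>
void trim (
    ForwardIterator first, ForwardIterator last, OutputIterator result,
    UnaryPredicate pred
) {
    // Skip leading junk.
    while (first != last && pred(*first))
        ++first;
    // p marks a pending junk character; p == last means none is pending.
    for (ForwardIterator p = last; first != last; ++first) {
        if (pred(*first))
            p = first;
        else {
            if (p != last) {
                // Advance the output iterator after each write so the
                // function also works with plain pointers and iterators.
                *result++ = *p;
                p = last;
            }
            *result++ = *first;
        }
    }
}

inline bool isJunk(char c) {
    // Cast first: passing a negative char to isspace is undefined behavior.
    return isspace(static_cast<unsigned char>(c)) != 0;
}

inline string trim_string(string s) {
    ostringstream result;
    trim(s.begin(), s.end(), ostream_iterator<char>(result, ""), isJunk);
    return result.str();
}

int main() {
    cout << trim_string(" What the fraaak ") << "." << endl;
}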
Output:
What the fraaak.
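Because the template only needs an output iterator, it can also write straight into a std::string instead of going through an ostringstream. A minimal sketch of that variant, repeating the template so it compiles on its own; trim_to_string is a hypothetical name, not part of the answer above:

```cpp
#include <cctype>
#include <iterator>
#include <string>

// Same algorithm as the trim template above: skip leading junk, drop
// trailing junk, and keep one junk character per inner run.
template <class ForwardIterator, class OutputIterator, class UnaryPredicate>
void trim(ForwardIterator first, ForwardIterator last,
          OutputIterator result, UnaryPredicate pred)
{
    while (first != last && pred(*first))
        ++first;
    for (ForwardIterator p = last; first != last; ++first) {
        if (pred(*first)) {
            p = first;              // remember the latest junk character
        } else {
            if (p != last) {
                *result++ = *p;     // emit one separator for the whole run
                p = last;
            }
            *result++ = *first;
        }
    }
}

inline bool isJunk(char c)
{
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical helper: appends directly to a std::string via back_inserter.
inline std::string trim_to_string(const std::string& s)
{
    std::string out;
    out.reserve(s.size());
    trim(s.begin(), s.end(), std::back_inserter(out), isJunk);
    return out;
}
```

Note that the separator written is the last whitespace character of each run, so a run that ends in a tab still produces a tab. If a plain space is wanted every time, writing ' ' instead of *p would be the change to make.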
Upvotes: 1
Reputation: 263390
I can only use C++98; regex support is C++11.
Here is an efficient single-pass, in-place solution that does not require any libraries and works in C++98:
template<typename FwdIter>
FwdIter replace_whitespace_by_one_space(FwdIter begin, FwdIter end)
{
    FwdIter dst = begin;

IGNORE_LEADING_WHITESPACE:
    if (begin == end) return dst;
    switch (*begin)
    {
    case ' ':
    case '\t':
        ++begin;
        goto IGNORE_LEADING_WHITESPACE;
    }

COPY_NON_WHITESPACE:
    if (begin == end) return dst;
    switch (*begin)
    {
    default:
        *dst++ = *begin++;
        goto COPY_NON_WHITESPACE;
    case ' ':
    case '\t':
        ++begin;
        // INTENTIONAL FALLTHROUGH
    }

LOOK_FOR_NEXT_NON_WHITESPACE:
    if (begin == end) return dst;
    switch (*begin)
    {
    case ' ':
    case '\t':
        ++begin;
        goto LOOK_FOR_NEXT_NON_WHITESPACE;
    default:
        *dst++ = ' ';
        *dst++ = *begin++;
        goto COPY_NON_WHITESPACE;
    }
}
Note that gotos are generally considered to be perfectly acceptable in generated code for finite automata, although in this case, I must admit, the code was generated by my brain and fingers ;)
Here is an example of how you might use the proposed solution:
#include <iostream>
#include <string>

int main()
{
    std::string example = "\t\t\tpush \t \t42\t\t\t";
    // Spell out the iterator type: auto is C++11, and this must build as C++98.
    std::string::iterator new_end =
        replace_whitespace_by_one_space(example.begin(), example.end());
    example.erase(new_end, example.end());
    std::cout << "[" << example << "]\n";
}
Upvotes: 2
Reputation: 2953
For those who can't use C++11, here is a simple non-regex solution:
void RemoveWhitespace(std::string *str)
{
    std::string &s = *str;
    // all tabs to spaces
    ReplaceString(s, "\t", " ");
    // all double spaces to single spaces; repeat until no pair is left
    while (ReplaceString(s, "  ", " ") != 0) {}
    // trim the string (at most one space can remain at each end);
    // back(), front() and pop_back() are C++11, so use indexing and erase
    if (!s.empty() && s[s.size() - 1] == ' ') s.erase(s.size() - 1);
    if (!s.empty() && s[0] == ' ') s.erase(0, 1);
}
Where ReplaceString may be implemented as follows (it must be declared before RemoveWhitespace):
// returns the number of replaced substrings
unsigned int ReplaceString(std::string &str, const std::string &search,
                           const std::string &replace)
{
    unsigned int count = 0;
    size_t pos = 0;
    while ((pos = str.find(search, pos)) != std::string::npos)
    {
        str.replace(pos, search.length(), replace);
        // continue after the inserted text so it is not matched again
        pos += replace.length();
        ++count;
    }
    return count;
}
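The reason the double-space replacement has to be repeated is that ReplaceString resumes searching after each replacement, so one pass may leave pairs behind. A minimal sketch that counts the passes; CollapseSpaces is a hypothetical helper added for illustration, and ReplaceString is repeated so the block compiles on its own:

```cpp
#include <string>

// Same ReplaceString as above: returns the number of replaced substrings.
unsigned int ReplaceString(std::string &str, const std::string &search,
                           const std::string &replace)
{
    unsigned int count = 0;
    std::string::size_type pos = 0;
    while ((pos = str.find(search, pos)) != std::string::npos)
    {
        str.replace(pos, search.length(), replace);
        pos += replace.length();
        ++count;
    }
    return count;
}

// Hypothetical helper: repeats the double-space pass until nothing changes
// and returns how many passes performed at least one replacement.
unsigned int CollapseSpaces(std::string &str)
{
    unsigned int passes = 0;
    while (ReplaceString(str, "  ", " ") != 0)
        ++passes;
    return passes;
}
```

With four spaces between two words, the first pass leaves two spaces behind and the second pass finishes the job, so a single call to ReplaceString would not be enough.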
Upvotes: 0
Reputation: 238491
If you want to replace all consecutive whitespace with a single space, you can do that easily with a trivial regex. If your compiler supports the current standard, it should have regex utilities in the standard library, but if you're limited to C++98, you can use an external library instead. Here's a solution using one such library (it collapses every run, but a string that begins or ends with whitespace keeps a single space at that end):
test = boost::regex_replace(test, boost::regex("\\s+"), " ");
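If C++11 is available, the same replacement can be written with the standard library's <regex> header. A minimal sketch, assuming a hypothetical helper name collapse_whitespace; the trimming step is an addition on top of the one-liner above, since replacing "\\s+" with " " still leaves one space at each end when the input starts or ends with whitespace:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: collapses each whitespace run to one space, then
// removes the single leading/trailing space the replacement may leave.
std::string collapse_whitespace(const std::string& input)
{
    static const std::regex runs("\\s+");
    std::string out = std::regex_replace(input, runs, " ");
    if (!out.empty() && out[0] == ' ')
        out.erase(0, 1);
    if (!out.empty() && out[out.size() - 1] == ' ')
        out.erase(out.size() - 1);
    return out;
}
```

Calling collapse_whitespace("\t push \t int32(45) \t") should give "push int32(45)". The regex is built once as a function-local static because constructing a std::regex is comparatively expensive.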
Upvotes: 0