cbrng

Reputation: 63

How to erase non-alpha chars and lowercase the alpha chars in a single pass of a string?

Given a string:

std::string str{"This i_s A stRIng"};

Is it possible to transform it to lowercase and remove all non-alpha characters in a single pass?

Expected result:

this is a string

I know you can use std::transform(..., ::tolower) and string.erase(remove_if()) combination to make two passes, or it can be done manually by iterating each character, but is there a way to do something that would combine the std::transform and erase calls without having to run through the string multiple times?
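
For reference, the two-pass combination mentioned above can be sketched as follows. This is a minimal sketch: the helper name `two_pass` is hypothetical, and it assumes whitespace should be kept, as the expected result suggests.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: the two-pass baseline (erase/remove_if, then transform).
std::string two_pass(std::string s)
{
    // Pass 1: drop everything that is neither a letter nor whitespace.
    s.erase(std::remove_if(s.begin(), s.end(),
                           [](unsigned char c) {
                               return !std::isalpha(c) && !std::isspace(c);
                           }),
            s.end());

    // Pass 2: lowercase whatever is left.
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return s;
}
```

Taking the characters as `unsigned char` avoids passing negative values to the `<cctype>` functions.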

Upvotes: 3

Views: 296

Answers (3)

Drew Dormann

Reputation: 63912

C++20 ranges allow algorithms to be combined, in a one-pass fashion.

std::string str{"This i_s A stRIng"};
std::string out;

auto is_alpha_or_space = [](unsigned char c){ return isalpha(c) || isspace(c); };
auto safe_tolower = [](unsigned char c){ return tolower(c); };

std::ranges::copy( str
    | std::views::filter(is_alpha_or_space)
    | std::views::transform(safe_tolower)
    , std::back_inserter(out));

See it on Compiler Explorer

Upvotes: 3

francesco

Reputation: 7549

First, note that you seem to want to keep alphabetic characters and spaces, that is, characters l for which std::isalpha(l) || std::isspace(l) returns true.

Assuming this, you can achieve what you want using std::accumulate:

str = std::accumulate(str.begin(), str.end(), std::string{},
    [](const std::string& s, const auto& l) {
        if (std::isalpha(l, std::locale()) || std::isspace(l, std::locale()))
             return s + std::tolower(l, std::locale());
        else
             return s;
    });

See it Live on Coliru.
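
A variant of the same idea, as a minimal sketch: the accumulator is taken by value and appended to, instead of building `s + ...` on every step. The helper name `lower_alpha_space` is hypothetical, and the `<cctype>` functions are swapped in for the `<locale>` overloads used above. Since C++20, std::accumulate moves the accumulator into the callback, so no per-character copy is needed; with earlier standards this still copies once per character.

```cpp
#include <cctype>
#include <numeric>
#include <string>

// Hypothetical helper: keeps letters (lowercased) and whitespace.
std::string lower_alpha_space(const std::string& in)
{
    return std::accumulate(in.begin(), in.end(), std::string{},
        [](std::string acc, unsigned char c) {
            if (std::isalpha(c) || std::isspace(c))
                acc += static_cast<char>(std::tolower(c));
            return acc; // a by-value parameter is moved out on return
        });
}
```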

Upvotes: 2

Remy Lebeau

Reputation: 597325

I know you can ... do it manually by iterating each character ...

That is exactly what you would have to do, e.g.:

std::string str{"This i_s A stRIng"};

std::string::size_type pos = 0;
while (pos < str.size())
{
    unsigned char ch = str[pos];
    if (!::isalpha(ch))
    {
        if (!::isspace(ch))
        {
            str.erase(pos, 1);
            continue;
        }
    }
    else
    {
        str[pos] = (char) ::tolower(ch);
    }
    ++pos;
}

Or:

std::string str{"This i_s A stRIng"};

auto iter = str.begin();
while (iter != str.end())
{
    unsigned char ch = *iter;
    if (!::isalpha(ch))
    {
        if (!::isspace(ch))
        {
            iter = str.erase(iter);
            continue;
        }
    }
    else
    {
        *iter = (char) ::tolower(ch);
    }
    ++iter;
}

but is there a way to do something that would combine the std::transform and erase calls without having to run through the string multiple times?

You can use the standard std::accumulate() algorithm, as shown in francesco's answer. Although, that will not manipulate the std::string in-place, as the code above does. It will create a new std::string instead (and will do so on each iteration, for that matter).
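
If an in-place result matters, the erase()-based loops above can also be written with separate read and write positions, so that removed characters do not cause the tail of the string to be shifted each time. The following is a minimal sketch under that assumption; the helper name `lower_alpha_space_in_place` is hypothetical.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: compacts and lowercases the string in place, in one pass.
void lower_alpha_space_in_place(std::string& s)
{
    std::string::size_type write = 0;
    for (std::string::size_type read = 0; read < s.size(); ++read)
    {
        unsigned char ch = s[read];
        if (std::isalpha(ch))
            s[write++] = static_cast<char>(std::tolower(ch)); // keep, lowercased
        else if (std::isspace(ch))
            s[write++] = static_cast<char>(ch);               // keep as-is
        // any other character is skipped
    }
    s.resize(write); // drop the leftover tail
}
```

Because `write` never gets ahead of `read`, each character is read before its slot can be overwritten.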

Otherwise, you could use C++20 ranges, i.e. by combining std::views::filter() with std::views::transform(), e.g. (I'm not familiar with the <ranges> library, so this syntax might need tweaking):

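#include <cctype>
#include <ranges>
#include <string>

auto alphaOrSpace = [](unsigned char ch){ return ::isalpha(ch) || ::isspace(ch); };
auto lowercase = [](unsigned char ch){ return (char) ::tolower(ch); };

std::string str{"This i_s A stRIng"};

// a view cannot be assigned to a std::string directly, so build a new string from it
auto view = str | std::views::filter(alphaOrSpace) | std::views::transform(lowercase);
str = std::string(view.begin(), view.end());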

But, this would actually be a multi-pass solution, just coded into a single operation.

Upvotes: 1
