fati

Reputation: 135

Perl regex replace in C++ (pack and unpack Perl translation into C++)

I have the following fragment of Perl code:

# In a string, replace all instances of `\A` with `A`,
# where `A` is any word character.
# Word characters include letters, digits, the underscore and more.
$exampleString =~ s/\\(\w)/$1/g;

# In a string, replace all instances of `\!` with `@33~`,
# where `!` is any non-whitespace character, and
# `33` is its ordinal value (unpack 'C' yields an unsigned 8-bit value).
$exampleString =~ s/\\(\S)/$char1 . unpack('C*', $1) . $char2/ge;

Some changes are then applied to $exampleString.

Then at the end:

# This attempts to reverse the aforementioned use of s///.
$exampleString =~ s/$char1(.*?)$char2/pack('C*', $1)/ge;

$char1 and $char2 are two characters defined by their ASCII values. For example, assume $char1 is @ and $char2 is ~.

This sequence hides `\` escape sequences from the intermediate code (not shown) and restores them afterwards.

I want to do the same thing in C++. Could someone help with either a line-by-line translation or C++ code with equivalent functionality?

Upvotes: 0

Views: 162

Answers (1)

Håkon Hægland

Reputation: 40778

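Here is an example. It uses std::regex_replace for the plain substitution and a std::regex_search loop wherever the Perl code uses the /e modifier, because std::regex_replace cannot call a function for each match: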

#include <iostream>
#include <regex>
#include <string>

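// Encode the first character of `str` as c1 + <decimal byte value> + c2.
std::string compute_ord_value(const std::string& str, char c1, char c2)
{
    // Go through unsigned char so bytes >= 128 do not become negative
    // numbers (Perl's unpack 'C' also yields 0..255).
    const unsigned char c = static_cast<unsigned char>(str.at(0));
    return std::string(1, c1) + std::to_string(int(c)) + std::string(1, c2);
}

// Decode a decimal byte value back into a backslash followed by that character.
std::string compute_char_value(const std::string& str)
{
    const int ord = std::stoi(str);
    return std::string{R"(\)"} + std::string(1, static_cast<char>(ord));
}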

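// Replace every c1 + digits + c2 marker with the escape sequence it encodes
// and print the result.
void transform_back(const std::string& str, char c1, char c2)
{
    std::string::const_iterator it = str.cbegin();
    std::string::const_iterator end = str.cend();
    std::string result;
    // Note: c1 and c2 are inserted into the pattern unescaped, so they
    // must not be regex metacharacters.
    std::regex re { std::string(1, c1) + R"((\d+))" + std::string(1, c2) };
    for (
         std::smatch match;
         std::regex_search(it, end, match, re);
         it = match[0].second
    ) {
        result += match.prefix();                    // text before the marker
        result += compute_char_value(match.str(1));  // decoded escape
    }
    result.append(it, end);                          // text after the last marker
    std::cout << result << '\n';
}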

int main() {
    std::string str { R"(xx \A \& \B yy)" };
    std::cout << str << '\n';
    char c1 = '@';
    char c2 = '~';

    // Equivalent of s/\\(\w)/$1/g: drop the backslash before word characters.
    std::regex re { R"(\\(\w))" };
    std::string result1 = std::regex_replace(str, re, "$1");

    // Equivalent of s/\\(\S)/.../ge: encode the remaining escapes as markers.
    std::string::const_iterator it = result1.cbegin();
    std::string::const_iterator end = result1.cend();
    std::string result2;
    std::regex re2 { R"(\\(\S))" };
    for (
         std::smatch match;
         std::regex_search(it, end, match, re2);
         it = match[0].second
    ) {
        result2 += match.prefix();
        result2 += compute_ord_value(match.str(1), c1, c2);
    }
    result2.append(it, end);
    std::cout << result2 << '\n';

    // Reverse step: decode the markers again.
    transform_back(result2, c1, c2);

    return 0;
}

Output:

xx \A \& \B yy
xx A @38~ B yy
xx A \& B yy
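
Both loops above follow the same pattern, so they could be factored into one helper that plays the role of Perl's /e modifier. The following is only a sketch: replace_with_callback, escape_char, hide_escapes and restore_escapes are hypothetical names introduced here, not standard library functions. It also assumes that escaping the delimiters is wanted, so that characters such as ( or $ can be used for $char1 and $char2.

```cpp
#include <cassert>
#include <functional>
#include <regex>
#include <string>

// Hypothetical helper: call `fn` on every match of `re` in `input` and
// splice its return value into the output, similar to Perl's s///ge.
std::string replace_with_callback(
    const std::string& input,
    const std::regex& re,
    const std::function<std::string(const std::smatch&)>& fn)
{
    std::string out;
    std::string::const_iterator last = input.cbegin();
    for (std::sregex_iterator it(input.cbegin(), input.cend(), re), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        out.append(last, m[0].first);  // unmatched text before this match
        out += fn(m);                  // computed replacement
        last = m[0].second;
    }
    out.append(last, input.cend());    // text after the last match
    return out;
}

// Hypothetical helper: backslash-escape a delimiter if it is an
// ECMAScript regex metacharacter, so it is matched literally.
std::string escape_char(char c)
{
    static const std::string meta = R"(\^$.|?*+()[]{})";
    std::string s;
    if (meta.find(c) != std::string::npos) s += '\\';
    s += c;
    return s;
}

// Replace each backslash + non-whitespace byte with c1 + <byte value> + c2.
std::string hide_escapes(const std::string& s, char c1, char c2)
{
    static const std::regex re{R"(\\(\S))"};
    return replace_with_callback(s, re, [c1, c2](const std::smatch& m) {
        const int byte = static_cast<unsigned char>(m.str(1)[0]);
        return std::string(1, c1) + std::to_string(byte) + std::string(1, c2);
    });
}

// Replace each c1 + digits + c2 marker with a backslash + the decoded byte.
std::string restore_escapes(const std::string& s, char c1, char c2)
{
    const std::regex re{escape_char(c1) + R"((\d+))" + escape_char(c2)};
    return replace_with_callback(s, re, [](const std::smatch& m) {
        return std::string("\\") + static_cast<char>(std::stoi(m.str(1)));
    });
}
```

With these helpers, hide_escapes(result1, c1, c2) takes the place of the second loop in main, and restore_escapes(result2, c1, c2) returns the string that transform_back prints.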

Upvotes: 2
