Reputation: 567
I have strings containing variables that I need to find and replace with values. E.g.:
my $str = "var1 var2 blah blah blah var3";
Sometimes the strings have embedded quoted strings:
my $str = "var1 var2 blah \"do not replace this: var1\" blah blah var3";
So I built a regex that matches both quoted strings and variables. When it matches a string, it replaces the string with itself; when it matches a variable, it replaces the variable with its value from a hash. To make this work in a single substitution, I split each match into two parts: the named group (macro) and the last capture group ($+). For a string, I capture the opening quote character (") in the named group and the rest of the string in the last group. For a variable, I capture the whole variable in the named group and nothing in the last group. To handle strings, I add a hash entry for {"} = '"'. For each match, the replacement is the hash lookup followed by the last group. This works well, but it seems awkward.
$line =~ s/(?:(?<macro>(?<!\\)")(.*?(?<!\\)")|(?<macro>(``|\b($list_of_hash_keys)\b))())/$variables->{$+{macro}}$+/gs;
Is there a cleaner, more beautiful way to write this regex?
Upvotes: 1
Views: 111
Reputation: 567
The answer is (*SKIP)(*FAIL). I needed to match the quoted string and follow it with (*SKIP)(*FAIL), which discards that match and resumes scanning after it.
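A minimal sketch of that idea, assuming the data may also contain backslash-escaped quotes (which is what the (?<!\\) lookbehinds in my original regex were for). The %variables table, the expand helper, and the sample input are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical lookup table, for illustration only.
my %variables = (var1 => 'mod1', var2 => 'mod2', var3 => 'mod3');

# Alternation of keys; quotemeta guards against regex metacharacters in keys.
my $keys = join '|', map { quotemeta } sort keys %variables;

# expand() is a made-up helper name.
sub expand {
    my ($line) = @_;
    # Branch 1: a quoted string that may contain backslash-escaped characters;
    #           (*SKIP)(*FAIL) rejects it and resumes scanning after it.
    # Branch 2: a whole-word key, captured in $1 and looked up in the hash.
    $line =~ s/"(?:[^"\\]|\\.)*"(*SKIP)(*FAIL)|\b($keys)\b/$variables{$1}/gs;
    return $line;
}

# Single quotes keep the backslashes literal in the test input.
print expand('var1 "keep \"var2\" here" var3'), "\n";
# prints: mod1 "keep \"var2\" here" mod3
```

With this, the extra {"} hash entry and the empty trailing capture group are no longer needed.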
Upvotes: 0
Reputation: 91528
use Modern::Perl;

my @in = (
    "var1 var2 blah blah blah var3",
    "var1 var2 blah \"do not replace this: var1\" blah blah var3",
);
my $variables = {
    var1 => "mod1",
    var2 => "mod2",
    var3 => "mod3",
    var4 => "mod4",
};
my $list_of_hash_keys = '\b(' . join('|', keys(%$variables)) . ')\b';
for (@in) {
    s/"[^"]+"(*SKIP)(*FAIL)|$list_of_hash_keys/$variables->{$1}/g;
    say;
}
Output:
mod1 mod2 blah blah blah mod3
mod1 mod2 blah "do not replace this: var1" blah blah mod3
Explanation:
" # quote
[^"]+ # 1 or more non quote
" # quote
(*SKIP) # skip everything that has been matched so far (i.e. the whole quoted string)
(*FAIL) # fail the match
| # OR
$list_of_hash_keys # list of keys to match, captured in group 1
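
To see why (*SKIP) matters and (*FAIL) alone is not enough, here is a small sketch that collects matches instead of substituting. The sample text and the var\d+ pattern are assumptions for illustration, not part of the question:

```perl
use strict;
use warnings;

# Made-up sample text with one quoted span.
my $text = 'var1 "var2 inside" var3';

# With (*SKIP): once the quoted span fails, the engine restarts after it,
# so nothing inside the quotes is ever tried.
my @with_skip = $text =~ /"[^"]+"(*SKIP)(*FAIL)|\b(var\d+)\b/g;

# Without (*SKIP): after the failure the engine retries one character
# further on, which is inside the quotes, and finds var2 there.
my @without_skip = $text =~ /"[^"]+"(*FAIL)|\b(var\d+)\b/g;

print "with:    @with_skip\n";     # with:    var1 var3
print "without: @without_skip\n";  # without: var1 var2 var3
```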
Upvotes: 0
Reputation: 830
It appears you're trying to implement a mini templating mechanism.... :)
I'm not sure if the following is beautiful, but here's my approach:
my $out = $str =~ s{
    (?<str> " [^"]+ " ) |
    \b (?<macro> $list_of_hash_keys ) \b
}{
    $+{str} // $variables->{$+{macro}}
}gsxre;
As you can see, the "/e" modifier is used. It is helpful in this case because it gets rid of the special '"' item in the $variables stash.
The (?<str> ...) group captures an embedded string, assuming there are no escaped quotes inside it. I did not test it fully, and I don't think this approach is equivalent to yours, nor do I know if it handles all edge cases properly.
But I think this should be enough to demonstrate the idea.
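For reference, a self-contained sketch of the same approach that can be run as-is. It assumes $list_of_hash_keys is a bare alternation such as var1|var2|var3, as in the question; the expand helper and the sample data are made up. The /r modifier needs Perl 5.14 or later:

```perl
use strict;
use warnings;
use feature 'say';

# Sample data, for illustration only.
my $variables = { var1 => 'mod1', var2 => 'mod2', var3 => 'mod3' };

# Bare alternation of keys; quotemeta guards against metacharacters in keys.
my $list_of_hash_keys = join '|', map { quotemeta } sort keys %$variables;

# expand() is a made-up helper name.
sub expand {
    my ($str) = @_;
    # /e evaluates the replacement as code: a quoted string is put back
    # unchanged, otherwise the matched key is looked up in the hash.
    # /r returns the modified copy and leaves $str untouched.
    return $str =~ s{
        (?<str> " [^"]* " )
      | \b (?<macro> $list_of_hash_keys ) \b
    }{
        $+{str} // $variables->{ $+{macro} }
    }gsxre;
}

say expand('var1 "do not replace this: var1" var3');
# prints: mod1 "do not replace this: var1" mod3
```

The named group wraps the whole alternation so that \b applies to every key, not just the first and last one.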
Upvotes: 1