Reputation: 11037
I have text input similar to that shown below. I'd like to add the word auto before each 'a=b' pattern, but only if it is part of a sequence of such patterns (separated by semicolons) following the keyword kywrd.
kywrd a=b;c=d;
e=f;
fnctn z;
g=h;
So the output I'm looking for here is:
kywrd2 auto a=b;auto c=d;
auto e=f;
fnctn z;
g=h;
The Perl 6 (Raku) code below uses a regular expression to add the auto keyword, but only before the first a=b pattern. Is there a simple way to perform the substitution for all patterns in the sequence, leaving g=h; unmodified?
my Str $x = slurp "in.q";
$x ~~ s:g /kywrd\s+(\w+)\=(\w+)\;/kywrd2 auto $0=$1\;/;
spurt "out.q", $x;
Upvotes: 7
Views: 332
Reputation: 32414
One way:
# Create a separate named regex that captures an `x=y;` pair:
my regex pair { (\w+) \= (\w+) \; (\s*) }
# (Capture `(\s*)` so formatting between pairs is retained)
# Generate and return 'auto'-ized replacement of a captured pair:
sub auto-ize ($/) { "auto $0=$1;$2" }
$x ~~ s:g { kywrd \s+ <pair>+ } = "kywrd2 $<pair>».&auto-ize.join()";
All the code I've shown would be simple to understand for someone a little familiar with Raku but I'll explain it anyway.
I've broken out a named regex to match a pair. (See my answer to Difference in capturing and non-capturing regex scope in Perl 6 / Raku for details about why/how <pair> calls the pair regex.)
The auto-ize subroutine uses the match variable ($/) as its parameter. This is convenient because $0 etc. are then automatically aliased to the numbered captures associated with the passed match object.
I've used syntax of the form s [ ... ] = " ... " because I think it's more readable for this use case. (See mention of "different delimiters" in the s/// doc.)
The "kywrd2 ..." string is evaluated anew for each of the multiple s:g matches, and each result replaces the corresponding match.
The $<pair>».&auto-ize.join() bit is code being interpolated under double quoted string rules.
$<pair> is short for $/<pair>, i.e. the <pair> key of $/. It refers to the pair named capture associated with the match variable. The latter will correspond to each match of the multiple s:g matches in turn.
The + quantifier in the regex <pair>+ means that, if it matches, it produces a List of capture (match) objects rather than just one (as would be the case if the expression was instead just <pair> or <pair>?).
» treats its LHS operand as a tree or list (in this case a list of one or more capture/match objects, one per foo=bar;... pair) and walks over its elements. For each "leaf" element the » does the operation on its right. (» is a powerful operator but has nice simple use cases such as this one where it's just a notationally convenient and compact equivalent of a for loop. You can write it as >> if you prefer ASCII.)
.&auto-ize calls the auto-ize subroutine as if it were a method, using the operand to its left as the first argument.
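To make the points about <pair>+ and » concrete, here is a minimal sketch that runs the regex on a single line and builds the same replacement text three ways. The names $m, $hyper, $mapped and $looped are only for illustration; the pair regex and the auto-ize sub are the ones shown above.

```raku
my regex pair { (\w+) \= (\w+) \; (\s*) }
sub auto-ize ($/) { "auto $0=$1;$2" }

# Match one line by hand instead of going through s:g.
my $m = "kywrd a=b;c=d;" ~~ / kywrd \s+ <pair>+ /;

# Because of the + quantifier, $m<pair> holds one match object per pair.
say $m<pair>.elems;    # 2

# The hyper form used in the answer ...
my $hyper = $m<pair>».&auto-ize.join;

# ... an explicit map over the same list ...
my $mapped = $m<pair>.map({ auto-ize($_) }).join;

# ... and a plain for loop.
my $looped = '';
for $m<pair>.list -> $p { $looped ~= auto-ize($p) }

say $hyper;            # auto a=b;auto c=d;
```

All three variables should end up holding the same string; the » form is simply the most compact spelling.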
The test input data from @PolarBear's answer:
kywrd a=b;c=d;
e=f;
fnctn z;
g=h;
k=m;
fnctn y;
kywrd m=n;
k=j;
kywrd z=a;b=i;
kywrd c=x;e=i;
z=q;
fnctn o;
Putting that into in.q and printing the resulting out.q with say displays:
kywrd2 auto a=b;auto c=d;
auto e=f;
fnctn z;
g=h;
k=m;
fnctn y;
kywrd2 auto m=n;
auto k=j;
kywrd2 auto z=a;auto b=i;
kywrd2 auto c=x;auto e=i;
auto z=q;
fnctn o;
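For trying this out without creating in.q and out.q, here is a minimal sketch that applies the same substitution to an in-memory string. The variable $sample and the heredoc are assumptions for illustration, and the input is shortened to the four lines from the question.

```raku
my regex pair { (\w+) \= (\w+) \; (\s*) }
sub auto-ize ($/) { "auto $0=$1;$2" }

# Assumption: a shortened copy of the test input, held in memory instead of in.q.
my $sample = q:to/END/;
    kywrd a=b;c=d;
    e=f;
    fnctn z;
    g=h;
    END

# Same substitution as above, applied to the string in place.
$sample ~~ s:g { kywrd \s+ <pair>+ } = "kywrd2 $<pair>».&auto-ize.join()";

print $sample;
```

The printed text should match the first four lines of the output shown above, with g=h; left untouched because it follows fnctn z; rather than the keyword.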
Upvotes: 5
Reputation: 6798
Not very elegant but workable code (ancient way)
#!/usr/bin/perl
use strict;
use warnings;
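OUTER: while(<DATA>) {
    if( s/kywrd /kywrd2 / ) {
        do {
            # Also rename the keyword on a kywrd line that directly
            # follows another sequence (no-op on the first pass).
            s/kywrd /kywrd2 /;
            # A line without any x=y pair ends the sequence.
            if( ! s/(\w+)=(\w+)/auto $1=$2/g ) {
                print;
                next OUTER;
            }
            print;
        } while ( <DATA> );
    } else {
        print;
    }
}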
__DATA__
kywrd a=b;c=d;
e=f;
fnctn z;
g=h;
k=m;
fnctn y;
kywrd m=n;
k=j;
kywrd z=a;b=i;
kywrd c=x;e=i;
z=q;
fnctn o;
I need to take a look at Raku to see what kind of animal it is.
Upvotes: 3
Reputation: 5072
One possible way that keeps the regexing to a minimum:
# Prefix every ';'-terminated piece of the captured text with 'auto '.
sub repl ($input)
{
    $input.Str
        .split(';', :skip-empty)
        .map( 'auto ' ~ * ~ ';')
        .join('')
};
my $foo = 'kywrd a=b;c=d;d=e;';
$foo ~~ s:g /kywrd \s+ (.+)/kywrd2 { repl($0) }/;
$foo.say;
Personally I'd prefer the method form subst over the s// operator though.
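# The replacement is a block so that $0 refers to the current match;
# a plain double-quoted string would be interpolated before subst runs.
$foo .= subst(/ kywrd \s+ (.+) /, -> $/ { "kywrd2 { repl($0) }" }, :g);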
Upvotes: 5