Reputation: 26
I am trying to do a regex replacement using Perl. I am calling sed from inside Perl, but it doesn't seem to work.
&#x0027;fairness&#x0027; and &#x0027;efficiency&#x2019;
I need to replace &#x0027;efficiency&#x2019; with &#x2018;efficiency&#x2019;
I tried the code below:
system "sed -e 's/\&\#x0027\;\([a-zA-Z0-9 _]*\)\&\#x2019\;/tooch&/g' trans.xml > tmp.xml";
system "sed -e 's/tooch\&\#x0027\;/\&\#x2018\;/g' tmp.xml > trans.xml"
The above sed commands work when run manually, but not from inside Perl.
Any help would be greatly appreciated!
Upvotes: 0
Views: 158
Reputation: 57354
A few serious problems:
1. Why are you calling sed? Sure, maybe IO is harder to do in Perl, but Perl has regexps built in.
use Path::Tiny qw(path);
my $content = path('trans.xml')->slurp;
$content =~ s/bar/baz/g;
$content =~ s/foo/bar/g;
path('trans.xml')->spew( $content );
Note: if trans.xml is UTF-8 encoded, all you have to do here is replace slurp/spew with slurp_utf8/spew_utf8, whereas sed may be ignorant of Unicode.
2. system with a string should be avoided where possible, for many reasons; one is the problem you've experienced: quoting is hard.
system('sed', '-e', $regexp )
is the preferred syntax wherever possible. Note you can't use this in conjunction with redirection, but you really don't need to.
3. Multiple calls to sed are not needed:
sed 's/foo/bar/g;s/bar/baz/g'
This will apply both expressions.
4. Once #3 is realised, the temporary file is not required:
sed -i 's/foo/bar/g;s/bar/baz/g' $file
This will modify $file in place.
5. When using system, you probably want to check the return value.
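To make the first point concrete for the entities in the question, here is a minimal sketch that does the whole replacement in Perl with a single substitution, so neither sed nor the temporary "tooch" marker is needed. The helper name fix_opening_quote and the sample string are illustrative assumptions; the character class is copied from the question's sed pattern.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): rewrite
# &#x0027;word&#x2019; as &#x2018;word&#x2019;, leaving everything else alone.
sub fix_opening_quote {
    my ($text) = @_;
    # Capture the quoted text and re-emit it between the new entities.
    $text =~ s/&#x0027;([a-zA-Z0-9 _]*)&#x2019;/&#x2018;$1&#x2019;/g;
    return $text;
}

# Sample input taken from the question.
my $sample = '&#x0027;fairness&#x0027; and &#x0027;efficiency&#x2019;';
print fix_opening_quote($sample), "\n";
# prints: &#x0027;fairness&#x0027; and &#x2018;efficiency&#x2019;
```

To apply this to a file, the same substitution could be run on the string returned by path('trans.xml')->slurp before writing it back with spew, as in point 1.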
Upvotes: 0
Reputation: 42094
You're a victim of the double quotes.
Replacing your system call with say will show you more clearly what's going on:
sed -e 's/&#x0027;([a-zA-Z0-9 _]*)&#x2019;/tooch&/g' trans.xml > tmp.xml
sed -e 's/tooch&#x0027;/&#x2018;/g' tmp.xml > trans.xml
See what's wrong? There are no backslashes left. They've been interpreted by the Perl double quotes, and are not there for sed to use.
Your case is a bit tricky to correct, since you already use (and need) the single quotes to pass to sed. You could theoretically escape what's needed one more time, but that's error-prone. It's much better to use Perl's other single-quoting facilities:
system q+sed -e 's/\&\#x0027\;\([a-zA-Z0-9 _]*\)\&\#x2019\;/tooch&/g' trans.xml > tmp.xml+;
system q(sed -e 's/tooch\&\#x0027\;/\&\#x2018\;/g' tmp.xml > trans.xml);
I used + as a separator on the first line because it happened not to be used in the string itself. I used plain parentheses in the second line because they were 100% unambiguous there.
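The difference between the quoting styles can be checked without running sed at all. The following is a small sketch that only prints the strings Perl would hand to the shell; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Double quotes: Perl consumes the backslashes before sed ever sees them.
my $double = "s/tooch\&\#x0027\;/\&\#x2018\;/g";

# q(...): single-quote semantics, the backslashes survive.
my $single = q(s/tooch\&\#x0027\;/\&\#x2018\;/g);

# Caveat: inside q(...), a backslash in front of the delimiter itself is
# removed, so \( and \) would lose their backslashes. A delimiter that does
# not occur in the string, such as +, avoids that.
my $parens = q(\(x\));
my $plus   = q+\(x\)+;

print "double: $double\n";   # double: s/tooch&#x0027;/&#x2018;/g
print "single: $single\n";   # single: s/tooch\&\#x0027\;/\&\#x2018\;/g
print "parens: $parens\n";   # parens: (x)
print "plus:   $plus\n";     # plus:   \(x\)
```

This also suggests a practical reason to prefer + over parentheses for the first command: that pattern contains \( and \), which sed needs to receive with their backslashes intact.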
Upvotes: 1