Reputation: 1314
I have an XML file and a checklist of regex replacements to apply to it. How can I read the patterns and replacements from the checklist and apply them to the XML file? I tried the approach below, but it doesn't work correctly. How can I do this?
I tried:
Input xml:
<xml>
<p class="text">The <em type="italic">end</em> of the text</p>
<p class="text">The <bold type="strong">end of the</bold> text</p>
<p class="text">The end of <samll type="caps">the<small> text</p>
</xml>
script:
use strict;
open(IN, "xml_file.xml") || die "can't open $!";
my $text = join '', <IN>;
my @ar = '';
my $testing;
foreach my $t (<DATA>){
@ar = split /\t/, $t;
chomp($ar[0]);
chomp($ar[1]);
$text =~ s/$ar[0]/$ar[1]/segi;
}
print $text;
__END__
<p([^>]+)?> <line>
<small([^>]+)?> <sc$1>
<bold type=\"([^"]+)\"> <strong act=\"$1\">
<(\/)?em([^>]+)?> <$1emhasis$2>
Desired output:
<xml>
<line>The <emhasis type="italic">end</emhasis> of the text</line>
<line>The <strong act="strong">end of the</strong> text</line>
<line>The end of <sc type="caps">the<sc> text</line>
</xml>
How can I apply the tag regexes from this checklist, and how can I use the values of the captured groups in the replacement?
Upvotes: 0
Views: 149
Reputation: 2604
With reference to an old SO post, you need to use double-eval substitution (the /ee modifier).
I couldn't make it work using <DATA>, but the code below works. You can structure @replace however you want; I just created a simple one.
my $text = <<XML;
<xml>
<p class="text">The <em type="italic">end</em> of the text</p>
<p class="text">The <bold type="strong">end of the</bold> text</p>
<p class="text">The end of <small type="caps">the</small> text</p>
</xml>
XML
my @replace = (
{
'select' => '<p([^>]+)?>',
'replace' => '"<line$1>"'
},
{
'select' => '/p>',
'replace' => '"/line>"'
},
{
'select' => '<small([^>]+)?>',
'replace' => '"<sc$1>"'
},
{
'select' => '/small>',
'replace' => '"/sc>"'
},
{
'select' => '<bold\s+type="(.+?)".*?>',
'replace' => '"<strong act=\"$1\">"'
},
{
'select' => '/bold>',
'replace' => '"/strong>"'
},
{
'select' => '<em([^>]+)?>',
'replace' => '"<emhasis$1>"'
},
{
'select' => '/em>',
'replace' => '"/emhasis>"'
},
);
# Apply each rule in order; /ee evaluates the replacement string as Perl code
for my $re (@replace) {
    $text =~ s/$re->{select}/$re->{replace}/sigee;
}
print $text;
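A minimal sketch of the same /ee idea for rules that arrive as plain pattern/replacement pairs, for example after splitting each checklist line on a tab. The rule list, the patterns in it, and the apply_rules helper are assumptions made for illustration, not part of the code above. Instead of storing the double quotes in every rule, the helper wraps each replacement in quotes itself, so the rules can be written the way they appear in the question.

```perl
use strict;
use warnings;

# Hypothetical rules: [pattern, replacement] pairs, as they might look
# after splitting each tab-separated checklist line.
my @rules = (
    [ '<p(?:\s[^>]*)?>',       '<line>' ],
    [ '</p>',                  '</line>' ],
    [ '<bold type="([^"]+)">', '<strong act="$1">' ],
    [ '</bold>',               '</strong>' ],
    [ '<(/?)em\b([^>]*)>',     '<$1emhasis$2>' ],
    [ '<(/?)small\b([^>]*)>',  '<$1sc$2>' ],
);

sub apply_rules {
    my ($text, @rules) = @_;
    for my $rule (@rules) {
        my ($pattern, $replacement) = @$rule;
        # Escape characters that are special inside a double-quoted string,
        # leaving "$" alone so that $1, $2, ... still interpolate.
        (my $quoted = $replacement) =~ s/(["\\\@])/\\$1/g;
        $quoted = qq{"$quoted"};
        # The first /e yields the string held in $quoted; the second /e
        # evaluates it as a double-quoted Perl string, expanding captures.
        $text =~ s/$pattern/$quoted/gee;
    }
    return $text;
}

print apply_rules(
    qq{<p class="text">The <em type="italic">end</em> of the text</p>\n},
    @rules,
);
```

Because /ee runs the replacement through eval, this is only reasonable when the rule list comes from a trusted source.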
Upvotes: 1
Reputation: 13792
Simply add:
$ar[0] = qr/$ar[0]/;
just before executing the regex substitution.
Also, you forgot this pattern:
</p> </line>
You have a typo in the input xml:
<samll type="caps">
should be
<small type="caps">
And finally, a piece of advice: parsing XML with regular expressions is not a good idea. I recommend using an XML parser from CPAN instead; it is a better choice (IMO).
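As a rough sketch of what the parser route could look like, here is one possibility using XML::LibXML. The choice of module, the %rename table, and the rename_tags helper are my assumptions for illustration; the mapping is read off the desired output in the question. It also assumes the input is well-formed, so the <samll> typo must be fixed first, otherwise the parser will refuse the document.

```perl
use strict;
use warnings;
use XML::LibXML;

# Tag renames read off the desired output in the question (an assumption).
my %rename = (p => 'line', em => 'emhasis', bold => 'strong', small => 'sc');

sub rename_tags {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);

    # Collect every element first, then rename the ones listed in %rename.
    for my $node ($doc->findnodes('//*')) {
        my $old = $node->nodeName;
        next unless exists $rename{$old};
        $node->setNodeName($rename{$old});

        if ($old eq 'p') {
            # <p class="text"> becomes a bare <line>
            $node->removeAttribute('class');
        }
        elsif ($old eq 'bold') {
            # <bold type="X"> becomes <strong act="X">
            my $type = $node->getAttribute('type');
            $node->removeAttribute('type');
            $node->setAttribute(act => $type) if defined $type;
        }
    }
    # Serialize the root element only, without the XML declaration.
    return $doc->documentElement->toString;
}

my $input = <<'XML';
<xml>
<p class="text">The <em type="italic">end</em> of the text</p>
<p class="text">The <bold type="strong">end of the</bold> text</p>
<p class="text">The end of <small type="caps">the</small> text</p>
</xml>
XML

print rename_tags($input), "\n";
```

Closing tags follow automatically because the parser renames whole elements, so no separate </p> or </em> rules are needed.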
Upvotes: 0