Spy

Reputation: 39

Replace characters with entities within XML attributes

I want to escape double quotes inside an XML attribute value. For instance:

FROM

<person name="Tiberius Claudius "Maximus"" sex="M">

TO

<person name="Tiberius Claudius &quot;Maximus&quot;" sex="M">

I was able to isolate the attribute value using sed:

$ cat sample.xml | sed -r 's/(<person name=")(.*)(" sex.*)/\2/'
  Tiberius Claudius "Maximus"

Is there a way to replace double quotes " with &quot; within the second group?

Upvotes: 1

Views: 274

Answers (2)

ikegami

Reputation: 385754

perl -i~ -pe's{<person name="\K(.*?)(?=" sex)}{ $1 =~ s/"/&quot;/gr }eg' sample.xml

Or, if your Perl is older than 5.14 (which added the /r modifier):

perl -i~ -pe's{<person name="\K(.*?)(?=" sex)}{ ( my $s = $1 ) =~ s/"/&quot;/g; $s }eg' sample.xml

Upvotes: 1

revo

Reputation: 48711

With Perl, you can do a find-and-replace using this pattern:

Find:

(?<!=)(")(?![^"]*\s+\w+=|[^"]*\/?>)

Replace with:

&quot;


Upvotes: 0
