Blaine Mucklow

Reputation: 591

Perl Regular Expression [] for <>

I am trying to read an XML file into a string in Perl and send it as part of a SOAP message. I know this is not ideal, since SOAP has proper mechanisms for sending files, but I am limited to the SOAP setup that is already in place, and it does not support sending files.

Therefore I need to replace the markup delimiters < and > with [ and ]. What is the best way to do this?

Upvotes: 0

Views: 410

Answers (4)

Dave Sherohman

Reputation: 46235

Won't somebody please think of the baby seals?

As others have already pointed out, both in answers and in comments, doing this with a regex will cause problems as soon as your data becomes complex enough to contain literal [ and ] or < and > characters in its content. Once that happens, any simple regex will break, and you'll need to either duct-tape it back together in the hope that it limps along a bit longer before breaking again, or re-implement it with a real XML parser and/or a better SOAP implementation.
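To make the failure concrete, here is a minimal sketch of the round trip, assuming the receiver undoes the swap with the reverse transliteration. The payload is a made-up example, not the asker's data.

```perl
use strict;
use warnings;

# Hypothetical payload: the text content already contains square brackets.
my $original = '<item>a[0]</item>';

# Sender side: swap angle brackets for square brackets.
(my $sent = $original) =~ tr/<>/[]/;

# Receiver side: try to undo the swap. Brackets that were part of the
# content are now indistinguishable from the ones that replaced markup.
(my $restored = $sent) =~ tr/[]/<>/;

print "sent:     $sent\n";      # [item]a[0][/item]
print "restored: $restored\n";  # <item>a<0></item>
```

The restored string no longer matches the original, because a[0] came back as a<0>.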

OTOH, leonbloy's suggestion of base64 encoding your data is actually a pretty good one. I hadn't thought of that and it should work just as well as a proper SOAP implementation, with the caveats that the sent data will be larger and, if you need to do wire-level debugging, it may be more difficult to interpret the content.

Upvotes: 1

leonbloy

Reputation: 76016

What about using Base64 instead?
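A rough sketch of what that could look like, using the core MIME::Base64 and Encode modules. The XML string is a made-up stand-in for the file contents, and this assumes the receiving side is able to Base64-decode the field.

```perl
use strict;
use warnings;
use Encode qw(encode);
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical payload standing in for the contents of the XML file.
my $xml = '<order id="7"><note>a[0]</note></order>';

# encode_base64 works on bytes, so encode the string as UTF-8 first.
my $bytes = encode('UTF-8', $xml);

# The empty second argument suppresses the line breaks that
# encode_base64 otherwise inserts every 76 characters.
my $encoded = encode_base64($bytes, '');

# The receiver reverses the step to get the original bytes back.
my $decoded = decode_base64($encoded);

print "$encoded\n";
print "round trip ok\n" if $decoded eq $bytes;
```

The encoded text contains only letters, digits, +, / and =, so there is nothing left in it for the SOAP layer to mistake for markup.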

Upvotes: 2

Robert Wohlfarth

Reputation: 1771

Will something simple like this work for you?

$a=~y/<>/[]/;

y (a synonym for tr) performs a one-to-one character transliteration: every < becomes [, and every > becomes ]. The perlop documentation explains it in greater detail.

Upvotes: 2

Tim Pietzcker

Reputation: 336478

If simply replacing < by [ and > by ] isn't working for you (perhaps because angle brackets are showing up in CDATA sections somewhere that you don't want replaced), then you probably won't profit much from regexes here. Regular expressions are not suited for matching non-regular languages like XML.

You might get away with searching for <([^>]+)> and replacing that with [$1]:

$subject =~ s/<([^>]+)>/[$1]/g;
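As an illustration of where this holds up and where it does not, here is a small sketch that wraps the substitution in a helper (the name convert and both inputs are invented for the example) and runs it on a simple tag and on an attribute value containing >.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution shown above.
sub convert {
    my ($subject) = @_;
    $subject =~ s/<([^>]+)>/[$1]/g;
    return $subject;
}

my $simple = '<note>hello</note>';
my $tricky = '<a title="x > y">text</a>';

# Simple markup converts cleanly.
print convert($simple), "\n";   # [note]hello[/note]

# A > inside an attribute value ends the match early, so the tag
# is split in the wrong place.
print convert($tricky), "\n";   # [a title="x ] y">text[/a]
```

The second result shows the kind of breakage to expect once the data goes beyond plain tags.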

Upvotes: 3
