Perl regular expression

Question

J.M.Astilleros

This is a single line. I just need to extract the word between the tags, which in this case is Astilleros. Is there a regex to do this? The problem I am facing is that there is no space between the words and the tags, and the end tag contains '/', which is also the delimiter in a Perl regex. Please help.

The idea is to get the names out, find them in the text on the page, and put tags around them (around Astilleros, in this case). I will definitely try XML parsers.

amon · Accepted Answer

Don't parse XML with regexes – it is just too damn hard to get right. There are good parsers lying around, just waiting to be utilized by you. Let's use XML::LibXML:

use strict; use warnings;
use XML::LibXML;

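my $dom = XML::LibXML->load_xml(string => <<'END');
<!-- wrapper element names and <surname> are placeholders; only <given-names> is confirmed by the XPath query below -->
<document>
  <article>
    <authors>
      <name>
        <given-names>J.M.</given-names>
        <surname>Astilleros</surname>
      </name>
    </authors>
  </article>
</document>
END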

# use XPath to find your element
my ($name) = $dom->findnodes('//given-names');
print $name->textContent, "\n";

(whatever you try, do not use XML::Simple!)
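
For the second half of the question (finding the extracted names in the page text and putting tags around them), here is a minimal sketch in core Perl. It assumes the names have already been pulled out with the parser and that the page text is a plain string. The sub name `tag_names`, the tag name `name`, and the sample text are all made up for illustration and are not from the original post.

```perl
use strict; use warnings;

# Hypothetical helper: wrap every occurrence of the given names in <$tag>...</$tag>.
sub tag_names {
    my ($text, $tag, @names) = @_;
    return $text unless @names;    # an empty alternation would match everywhere

    # Longest names first, so a longer name wins over a shorter one at the same position.
    # quotemeta makes characters such as the dots in "J.M." match literally.
    my $alternation = join '|',
                      map  { quotemeta($_) }
                      sort { length($b) <=> length($a) } @names;

    # s{...}{...} uses braces as delimiters, so the '/' in the closing tag
    # needs no escaping. The lookarounds keep a name from matching inside
    # a longer word; unlike \b they also work for names ending in a dot.
    (my $tagged = $text) =~ s{(?<!\w)($alternation)(?!\w)}{<$tag>$1</$tag>}g;
    return $tagged;
}

# Sample input, invented for this example.
print tag_names('The data of Astilleros agree with earlier work.', 'name', 'Astilleros'), "\n";
```

The regex here only runs over plain text and a known list of names, so it does not try to parse any XML. If the page itself is XML or HTML, it is probably safer to walk its text nodes with the parser and apply the substitution to each one.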
