Reputation: 2563

Perl's HTML::Element - dumping just the descendants as HTML

I'm having trouble trying to output the contents of a matched node that I'm parsing:

<div class="description">some text <br/>more text<br/></div>

I'm using HTML::TreeBuilder::XPath to find the node (there's only one div with this class):

my $description = $tree->findnodes('//div[@class="description"]')->[0];

It finds the node (returned as an HTML::Element, I believe), but $description->as_HTML includes the element itself too. I just want everything contained inside the element as HTML:

some text <br/>more text<br/>

I could obviously strip the outer tags with a regex, but that feels messy, and I'm sure I'm just missing a function somewhere that does this?

Upvotes: 3

Answers (2)

Jens Erat

Reputation: 38682

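Append /node() to the XPath expression to select all child nodes of the div, both text nodes and elements. The result is a list of nodes rather than a single element, so each node still has to be serialized.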

my $description = $tree->findnodes('//div[@class="description"]/node()');
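To get from the matched node to the inner HTML string the question asks for, one option is to walk the element's children and serialize each one. The following is a minimal sketch, not a built-in: inner_html is a hypothetical helper name, and it assumes that text children come back from content_list as plain decoded strings (as they do in an HTML::TreeBuilder tree), so they are re-escaped with HTML::Entities. The exact formatting of the result (for example how as_HTML writes the br tags) may differ between HTML::Tree versions.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;
use HTML::Entities qw(encode_entities);

# Hypothetical helper (not part of HTML::Element): serialize only the
# children of an element, leaving out the element's own start and end tags.
sub inner_html {
    my ($element) = @_;
    return join '', map {
        ref($_)
            ? $_->as_HTML            # child element: serialize with its own tags
            : encode_entities($_)    # text child (plain string): re-escape it
    } $element->content_list;
}

my $tree = HTML::TreeBuilder::XPath->new_from_content(
    '<div class="description">some text <br/>more text<br/></div>'
);

# List context returns the matched nodes; take the first (only) one.
my ($description) = $tree->findnodes('//div[@class="description"]');

my $html = inner_html($description);
print "$html\n";

$tree->delete;    # release the tree when done
```

Using content_list on the element found by the original query keeps the XPath unchanged; the same loop could be applied to the nodes returned by the /node() query, but text nodes there are wrapped in objects and would need different handling.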

Upvotes: 0

Gilles Quénot

Reputation: 185219

Try this:

my $description = $tree->findnodes('//div[@class="description"]/text()')->[0];

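This is an XPath trick: text() selects the text-node children of the div. Note that with ->[0] only the first text node is returned, and the <br/> elements are not included.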

Upvotes: 0
