tread

Reputation: 11098

WWW::Mechanize::Firefox How do you extract the text within HTML element tags?

Good Day,

How do you print the text of an HTML tag with WWW::Mechanize::Firefox?

I have tried:

    print $_->text, '/n' for $mech->selector('td.dataCell');

    print $_->text(), '/n' for $mech->selector('td.dataCell');


    print $_->{text}, '/n' for $mech->selector('td.dataCell');

    print $_->content, '/n' for $mech->selector('td.dataCell');

Note that I do not want {innerHTML}, although that does work.

    print $_->{text}, '/n' for $mech->selector('td.dataCell');

The above line runs without error, but the output is just a series of /n.

Upvotes: 1

Views: 1511

Answers (4)

CJ7

Reputation: 23275

Either:

    $element->{textContent};

or

    $element->{innerText};

will work.
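
A minimal sketch of how the textContent approach might be wrapped up, assuming $mech is a WWW::Mechanize::Firefox object that has already loaded the page. The helper name cell_texts is hypothetical and not part of the module. Note the double-quoted "\n" in the usage comment: a single-quoted '/n' prints the two literal characters /n rather than a newline.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize::Firefox): returns the
# trimmed text of every element matched by a CSS selector.
# $mech is assumed to provide a selector() method that returns DOM elements.
sub cell_texts {
    my ($mech, $css) = @_;
    my @texts;
    for my $element ($mech->selector($css)) {
        # DOM properties are read with hash syntax, not as method calls
        my $text = $element->{textContent};
        next unless defined $text;
        $text =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
        push @texts, $text;
    }
    return @texts;
}

# Usage, assuming the page is already loaded in $mech:
#   print "$_\n" for cell_texts($mech, 'td.dataCell');
```

Because textContent returns the text of the element and all its descendants, any markup nested inside a cell is dropped without extra formatting work.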

Upvotes: 0

Andy Post

Reputation: 33

my $node = $mech->xpath('//td[@class="dataCell"]/text()');

print $node->{nodeValue};

Note that if you're retrieving text interspersed with other tags, like "Test_1" and "Test_3" in this example...

<html>
  <body>
    <form name="input" action="demo_form_action.asp" method="get">
      <input name="testRadioButton" value="test 1" type="radio">Test_1<br>
      <input name="testRadioButton" value="test 3" type="radio">Test_3<br>
      <input value="Submit" type="submit">
    </form>
  </body>
</html>

You need to refer to them by their position within the tag (taking any newlines into account):

$node = $mech->xpath("//form/text()[2]", single => 1);

print $node->{nodeValue};

Which prints "Test_1".

Upvotes: 3

tread

Reputation: 11098

The only solution I have is to use:

my $element = $mech->selector('td.dataCell');

my $string = $element->{innerHTML};

and then format the HTML within each dataCell afterwards.

Upvotes: 1

Gilles Quénot

Reputation: 185254

I would do:

print $mech->xpath('//td[@class="dataCell"]/text()');

using an XPath expression.
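
Since xpath returns node objects rather than plain strings, it may be necessary to read nodeValue from each text node before printing. A rough sketch under that assumption follows; the helper name text_nodes is hypothetical, and $mech is assumed to be a WWW::Mechanize::Firefox object with the page already loaded.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize::Firefox): returns the
# nodeValue of every non-blank text node matched by an XPath query.
# $mech is assumed to provide an xpath() method that returns DOM nodes.
sub text_nodes {
    my ($mech, $query) = @_;
    my @values;
    for my $node ($mech->xpath($query)) {
        my $value = $node->{nodeValue};
        next unless defined $value;
        next unless $value =~ /\S/;    # skip whitespace-only text nodes
        push @values, $value;
    }
    return @values;
}

# Usage, assuming the page is already loaded in $mech:
#   print "$_\n" for text_nodes($mech, '//td[@class="dataCell"]/text()');
```

Skipping whitespace-only nodes avoids printing the newlines and indentation that sit between tags in the page source.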

Upvotes: 1
