Carol.Kar
Carol.Kar

Reputation: 5345

Use xpath in importhtml

I am trying to get from product hunt the upvotes into a google spreadsheet.

I tried the following:

=IMPORTHTML(A2,"//section[@class='post-vote-button--count']/text()")

A2 is in my case https://www.producthunt.com/tech/i-brew-my-own-coffee.

Any suggestions what I am doing wrong?

Upvotes: 0

Views: 2743

Answers (2)

Mathias Müller
Mathias Müller

Reputation: 22617

IMPORTHTML only allows you to find either lists or tables. IMPORTXML would allow you to use a general XPath expression.

The IMPORTXML function is actually intended for reading in XML data, not HTML. But if the HTML is XHTML or reasonable HTML, you can use IMPORTXML to import HTML data and then apply an XPath expression to it.

In this case, the HTML is not well-formed enough to be parsed by the IMPORTXML function and IMPORTHTML does not allow you to find the content you are interested in.

So, as far as I know: there is no way to achieve what you want - except writing custom code for Google Sheets.


EDIT: The comments above are an incorrect analysis of the OP's specific problem, as pointed out by Aurielle Perlmann. In fact, this formula

=IMPORTXML("https://www.producthunt.com/tech/i-brew-my-own-coffee","//span[@class='post-vote-button--count']")

or the one of Aurielle will work. In the HTML source, "166" - the wanted result - is not inside a section element, but a span element. Here is a snippet of the page source:

<span class="post-vote-button--count" 
      data-reactid=".2dbdvned0xs.0.0.0.2.1.1">166</span>

Additionally, it seems that using text() in an XPath expression in Google Sheets is buggy.

Upvotes: 2

Aurielle Perlmann
Aurielle Perlmann

Reputation: 5509

You can pull it with this xpath:

=IMPORTXML("https://www.producthunt.com/tech/i-brew-my-own-coffee","//*[@class='post-vote-button--count']")

Upvotes: 2

Related Questions