Reputation: 41
I'm importing into Google Sheets with IMPORTXML with the following XPATH:
=IMPORTXML(A2;"//*[@id='mw-content-text']/div/table[1]/tbody/tr[4]/td[1]/ul/li")
A2 containing the URL (https://stt.wiki/wiki/20th_Century_Pistol).
From the website I want to import the list entries in the "Basic" column and "Crafted From" row of the table.
There are only two list entries in this section of the table:
Therefore, I expected to get only those two list entries as rows in my sheet. Instead, I got an additional blank row above those two entries. When I change "td[1]" to "td[3]" in the XPATH query however, there are no extra blanks.
I don't understand where the additional blank row is coming from and how I can avoid it.
Google Sheet with desired and actual result
Upvotes: 0
Views: 1060
Reputation: 201338
When I saw the HTML of the URL, there are 2 li
tags in the ul
tag. So I think that your xpath is correct. But from your issue, I was worry that the sup
tag might affect to this situation. But I'm not sure whether this is the direct reason. So I would like to propose to add the attribute of li
for your xpath as follows.
When your xpath is modified, please modify as follows.
From://*[@id='mw-content-text']/div/table[1]/tbody/tr[4]/td[1]/ul/li
To:
//*[@id='mw-content-text']/div/table[1]/tbody/tr[4]/td[1]/ul/li[@style='white-space:nowrap']
[@style='white-space:nowrap']
, the value of li
with style='white-space:nowrap'
is retrieved.The formula is =IMPORTXML(A1;"//*[@id='mw-content-text']/div/table[1]/tbody/tr[4]/td[1]/ul/li[@style='white-space:nowrap']")
. Please put the URL to the cell "A1".
//*[@id='mw-content-text']/div/table[1]/tbody/tr[4]/td[1]/ul/li[position()>1]
.Upvotes: 2
Reputation: 5905
To complete the very neat @Tanaike's answer, another expression :
=IMPORTXML(A2;"//th[contains(.,'Crafted')]/following::td[1]//li[contains(@style,'white')]")
If a blank line is added it's because GoogleSheets parses an additional blank li
element containing a @style
attribute.
Upvotes: 1