Reputation: 13
I'm using Google Sheet IMPORTHTML() to get the text from a website. I am trying to get the text after a certain search string via IMPORTHTML() in Google Sheet using XPath.
As an example, I would like to extract number of rooms and living space from the websites below https://www.comparis.ch/immobilien/details/favorite?id=26720381
https://www.comparis.ch/immobilien/details/favorite?id=26714976
when i extract the xpath via chrome, it unfortunately isn't always exactly the same for each url. That's why I wanted to try with searching a string:
//*[@id="__next"]/div/div[1]/div[1]/div[4]/div[1]/div[3]/div[1]/div/div/div[2]/div/div[1]/p
//*[@id="__next"]/div/div[1]/div[1]/div[4]/div[1]/div[3]/div[1]/div/div/div[2]/div/div[1]/p/span
Plan is therefore to have in column A a list of urls, in row 1 the respective xpath (incl. search string like "room" or "living space" and a table where =importxml($A2,B$1) would use these urls and xpaths to search for the respective values.
I was trying to get the following-sibling after the place where it found for example the string "room" or "living space"
Help is much appreciated as I am new and lost at the moment. Best, Ueli
Upvotes: 1
Views: 664
Reputation: 201603
How about modifying your xpath as follows?
//p[@class='css-1ush3w6 excbu0j2' and ../..//span[text()='Zimmer']]
When this modified xpath is used for your 2 URLs, the following result is obtained.
In this sample, the URL and =IMPORTXML(A1, "//p[@class='css-1ush3w6 excbu0j2' and ../..//span[text()='Zimmer']]")
is put to the cells "A1" and "B1", respectively.
Upvotes: 0
Reputation: 1
try:
=VLOOKUP("Zimmer", IMPORTXML(A1, "//div[@class='css-1wfw5hl excbu0j4']"), 2, 0)
and drag down:
Upvotes: 0