Reputation: 11
i would like to return the position number of the first tag that contains a choosen text in a set of tags.
I want to set up a google sheet with 4 columns :
1 : the URL (https://www.castorama.fr/electricite/interrupteur-et-prise/interrupteur-et-prise-etanche/cat_id_4414.cat)
2 : the number of products on the category page :
=IMPORTXML($B4;"count(//p[@data-test-id='productTitle'])")
3 : The number of products which its title contains "Legrand" or "Céliane with Netatmo" :
=IMPORTXML($B4;"count(//p[@data-test-id='productTitle'][contains(.,'Céliane with Netatmo')or contains(.,'Legrand')])")
The above formulas are good. I'm having a hard time on the fourth one :
4 : The position of the first product which its title contains "Legrand" or "Céliane with Netatmo".
For example, column 2 returns 24 (products), and column 3 returns 12 (products). If the first product is a product with "Legrand" or "Céliane with Netatmo", i want the column 4 to return 1. If it is in 4th position, i want it to return 4.
i tried
=IMPORTXML($B4;"//div//p[@data-test-id='productTitle'][.='Legrand']/preceding-subling::*)+1")
but it seems that i don't have the good order for the "//div//p". i bet it have to do with the subling elements, before the "/" , i must not have the correct selected tags.
Upvotes: 0
Views: 67
Reputation: 5915
Another option with the following XPath (we use ()
[]
and preceding
axis) :
=IMPORTXML(B4;"count((//p[@data-test-id='productTitle'][contains(.,'Céliane with Netatmo')or contains(.,'Legrand')])[1]/preceding::div[@data-test-id='product-panel'])+1")
Output :
For safety reason (if there's no Legrand product on the page), formula in E4 should be :
=IF(D4=0;"NA";IMPORTXML(B4;"count((//p[@data-test-id='productTitle'][contains(.,'Céliane with Netatmo')or contains(.,'Legrand')])[1]/preceding::div[@data-test-id='product-panel'])+1"))
Castorama pour ré-u-ssir !
Upvotes: 0
Reputation: 24930
They really hid the data very deeply on that page....
Try (using "Céliane" for this example; obviously you can modify it further):
=IMPORTXML($B4,"count((//li[@class='b9bdc658'][.//p[@data-test-id='productTitle'][contains(.,'Céliane')]])[1]/preceding-sibling::*)+1")
Upvotes: 1