RaymanSix

Reputation: 43

Selenium: how to get/extract the XPath of a WebElement in Python?

I know how to find a webelement using XPATH like:

fruit = webdriver.find_element(By.XPATH, '/div/div[1]/div[2]').text

Output 
fruit = 'banana'

But what I really want is to do the reverse:

banana_path = webdriver."someway get the XPATH"(text = 'banana')

Output 
banana_path = '/div/div[1]/div[2]'

I want to do this because I first scrape all the times listed on the site, so that when one of them equals 10 (for example) I can go back to the site and scrape the text that corresponds to it. Unfortunately, there are dozens of pieces of information (all with the same class name), and their number keeps increasing or decreasing according to demand. That's why I need the XPath: with it I could go directly to what I want to find.

For example, if I had the XPath of the time element:

time_path = '/div[1]/div/div/div/div/div[1]/div[1]/div[2]/div[3]'

I could find and scrape the text of an element at a nearby XPath:

webdriver.find_element(By.XPATH, '/div[1]/div/div/div/div/div[1]/div[1]/span/div').text

I found an answer about this on Stack Overflow, but I'm using Python, not JavaScript.

Find an element by text and get xpath - selenium webdriver junit

I also found this answer showing how to do it with urllib2 and lxml; however, the site I'm accessing has strong protection against automation, and I was only able to get in with Selenium.

How to get an XPath from selenium webelement or from lxml?

I really appreciate your help, because this is the last missing part of my automation.

Upvotes: 3

Views: 1606

Answers (1)

CYCNO

Reputation: 86

I understand your problem. I used Selenium and lxml together, since you already mentioned both modules. I'm not sure whether my method will work properly, because I took the lxml part from the second link in your question, How to get an XPath from selenium webelement or from lxml?

So here is my approach:

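# First, load the page with Selenium and grab the rendered HTML

from selenium import webdriver

url = ''

# Selenium 4.6+ locates the driver itself; older versions need the driver path
driver = webdriver.Chrome()
driver.get(url)

data = driver.page_source  # page_source is a property, not a method

# Then get your XPath using lxml, because you already have the data above

from lxml import etree

xpath = ''

# etree.parse() expects a file, so parse the HTML string with etree.HTML()
root = etree.HTML(data)
tree = root.getroottree()
element = root.xpath(xpath)[0]  # first element matching your query
print(tree.getpath(element))    # absolute XPath of that element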
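
To make the "text in, XPath out" lookup from the question concrete, here is a minimal sketch of how it could look with lxml. It is only an illustration: SAMPLE_HTML is a made-up stand-in for driver.page_source, and the helper names xpaths_for_text and text_at are hypothetical, not part of Selenium or lxml. It also assumes the text you search for is the element's own text and matches exactly after whitespace is trimmed.

```python
from lxml import etree

# Hypothetical stand-in for driver.page_source (assumption: the real page
# has a similar "time next to a label" layout).
SAMPLE_HTML = """
<html><body>
  <div>
    <div><span>10:00</span><div>apple</div></div>
    <div><span>11:30</span><div>banana</div></div>
  </div>
</body></html>
"""


def xpaths_for_text(html, text):
    """Return the absolute XPath of every element whose own text equals `text`."""
    root = etree.HTML(html)
    tree = root.getroottree()
    # $t is an XPath variable, so the text needs no manual quoting or escaping.
    matches = root.xpath("//*[normalize-space(text())=$t]", t=text)
    return [tree.getpath(el) for el in matches]


def text_at(html, xpath):
    """Return the text of the first element found at `xpath`."""
    root = etree.HTML(html)
    return root.xpath(xpath)[0].text


paths = xpaths_for_text(SAMPLE_HTML, "banana")
print(paths)

# Walk to a nearby element: drop the last step of the path to reach the
# parent, then select the <span> that sits next to the matched <div>.
parent_path = paths[0].rsplit("/", 1)[0]
print(text_at(SAMPLE_HTML, parent_path + "/span"))
```

A path obtained this way can usually be passed back to driver.find_element(By.XPATH, path), but it is computed from a snapshot of the page, so it may stop matching if the page changes between the snapshot and the lookup.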

Upvotes: 2
