Reputation: 211
I'm writing a Selenium script in Python that goes to a classifieds page, obtains the info of one of my posts, deletes it and then reposts it. However, I'm stuck here: is there a way of getting the text inside the following HTML and saving it either as a local variable inside the script or as a text file?
The thing is, I'm trying to make the script generic, and since the text in my posts varies, I'm not sure if finding by XPath will work. Is there a way to target the div by its id and return the text inside it?
<div id="UserContent">
<table>
<tbody>
<tr>
<td>
<span itemprop="description">
"Text I need"
</span>
</td>
</tr>
</tbody>
</table>
</div>
Upvotes: 0
Views: 1231
Reputation: 180391
The text itself is irrelevant; select the element by its id and attributes instead:
"//*[@id='UserContent']//span[@itemprop='description']//text()"
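In Selenium, locate the element itself (without the trailing //text(), since Selenium returns elements, not text nodes) and read its .text attribute: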
text = driver.find_element_by_xpath("//*[@id='UserContent']//span[@itemprop='description']").text
Or use a CSS selector:
text = driver.find_element_by_css_selector("#UserContent span[itemprop=description]").text
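The question also asks about saving the text to a file, so here is a minimal sketch of that step. The helper name save_description and the file name post_description.txt are made up for illustration. It calls the generic driver.find_element(strategy, selector) form, because newer Selenium 4 releases dropped the find_element_by_* helpers. The string "css selector" is the value of By.CSS_SELECTOR, written out here only so the helper does not need to import selenium.

```python
from pathlib import Path

# Same value as selenium.webdriver.common.by.By.CSS_SELECTOR; spelled out
# here so the helper itself does not need to import selenium.
CSS_SELECTOR = "css selector"

# Hypothetical default output file name, chosen for illustration.
DEFAULT_PATH = "post_description.txt"


def save_description(driver, path=DEFAULT_PATH):
    """Read the post description from the current page and save it to a file."""
    # Scope the lookup by the container id, then by the itemprop attribute,
    # so the selector does not depend on the text of the post.
    element = driver.find_element(
        CSS_SELECTOR, "#UserContent span[itemprop='description']"
    )
    # Trim defensively in case the markup's surrounding whitespace comes through.
    text = element.text.strip()
    Path(path).write_text(text, encoding="utf-8")
    return text
```

After navigating to the post, description = save_description(driver) would keep the text in a local variable and also write it to the file, so it can be reused when reposting.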
An example using lxml:
In [12]: from lxml import html
In [13]: h = """<div id="UserContent">
....: <table>
....: <tbody>
....: <tr>
....: <td>
....: <span itemprop="description">
....: "Text I need"
....: </span>
....: <span itemprop="foo">bar</span>
....: </td>
....: </tr>
....: </tbody>
....: </table>
....: </div>
....:
....: """
In [14]: tree = html.fromstring(h)
In [15]: print(tree.cssselect("#UserContent span[itemprop=description]")[0].text)
"Text I need"
In [16]: print(tree.xpath("//*[@id='UserContent']//span[@itemprop='description']//text()")[0])
"Text I need"
Upvotes: 1