Reputation: 8047
I am using XPath to try to get the title text from the following section of a table:
<td class="title" title="if you were in a job and then one day, the work..." data-id="3198695">
<span id="thread_3198695" class="titleline threadbit">
<span class="prefix">
</span>
<a id="thread_title_3198695" href="showthread.php?t=3198695">would this creep you out?</a>
<span class="thread-pagenav">(Pgs:
<span><a href="showthread.php?t=3198695">1</a></span> <span><a href="showthread.php?t=3198695&page=2">2</a></span> <span><a href="showthread.php?t=3198695&page=3">3</a></span> <span><a href="showthread.php?t=3198695&page=4">4</a></span>)</span>
</span>
<span class="byline">
by
<a href="member.php?u=1687137" data-id="3198695" class="username">
damoni
</a>
</span>
</td>
The output I want is: "if you were in a job and then one day, the work..."
I have been trying various expressions in Scrapy (Python) to get the title. The following expression outputs only odd whitespace text such as: '\n\n \r \r \n \n\n\r'
response.xpath("//tr[3]/td[@class='title']/text()")
I know that the following part is correct, at least (I verified that it locates the correct table element using Chrome's developer tools):
//tr[3]/td
# (This is the above snippet)
Any idea how I can extract the title?
Upvotes: 1
Views: 5397
Reputation: 157967
You want:
response.xpath("//tr[3]/td[@class='title']/@title")
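Note that text() selects the text nodes directly inside an element, whereas @attribute selects the value of an attribute. In your markup the td itself directly contains only the whitespace between its child elements, which is why text() returned strings such as '\n\n \r \r \n \n\n\r'. Since the desired text is stored in the title attribute, you need to use @title.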
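To see the difference between an element's own text and its attributes without running a crawl, here is a minimal sketch using only Python's standard library. It is an illustration, not the Scrapy code itself: the markup is a simplified, well-formed stand-in for the cell in the question (pagination links dropped, wrapping table and tr assumed), and because ElementTree supports only a subset of XPath, the attribute is read with .get() rather than with /@title.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the table cell in the question.
# The pagination links are omitted and the <table>/<tr> wrapper is assumed.
ROW = """<table><tr>
<td class="title" title="if you were in a job and then one day, the work..." data-id="3198695">
<span id="thread_3198695" class="titleline threadbit">
<a id="thread_title_3198695" href="showthread.php?t=3198695">would this creep you out?</a>
</span>
</td>
</tr></table>"""

root = ET.fromstring(ROW)
td = root.find(".//td[@class='title']")

# Text directly inside <td>: only the line break before the first <span>.
direct_text = td.text
# The wanted string is stored in the title attribute, not in a text node.
title_attr = td.get("title")
# The visible link text is yet another string, nested two levels down.
link_text = td.find(".//a").text

print(repr(direct_text))
print(title_attr)
print(link_text)
```

In Scrapy itself, the expression above returns a selector list; calling .get() on it should give the attribute value as a plain string.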
Upvotes: 3