BBedit

Reputation: 8047

xPath: How to get 'title' text from table?

I am using xPath to try to get the title text from the following section of a table:

    <td class="title" title="if you were in a job and then one day, the work..." data-id="3198695">
        <span id="thread_3198695" class="titleline threadbit">

            <span class="prefix">



            </span>
            <a id="thread_title_3198695" href="showthread.php?t=3198695">would this creep you out?</a>

            <span class="thread-pagenav">(Pgs:
                 <span><a href="showthread.php?t=3198695">1</a></span> <span><a href="showthread.php?t=3198695&amp;page=2">2</a></span> <span><a href="showthread.php?t=3198695&amp;page=3">3</a></span> <span><a href="showthread.php?t=3198695&amp;page=4">4</a></span>)</span>

        </span>
        <span class="byline">


                by
                <a href="member.php?u=1687137" data-id="3198695" class="username">
                    damoni
                </a>

        </span>

    </td>

The output I want is: "if you were in a job and then one day, the work..."

I have been trying various expressions in Scrapy (Python) to get the title. The following one returns only whitespace, such as '\n\n \r \r \n \n\n\r':

    response.xpath("//tr[3]/td[@class='title']/text()")

I know that at least the following part is correct (I verified with Chrome's developer tools that it locates the correct table element):

    //tr[3]/td
    # (This is the above snippet)

Any idea as to how I can extract the title?

Upvotes: 1

Views: 5397

Answers (1)

hek2mgl

Reputation: 157967

You want:

    response.xpath("//tr[3]/td[@class='title']/@title")

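Note that text() selects the text nodes that are direct children of an element, whereas @attribute selects the value of an attribute. The only text directly inside your td is the whitespace between its child elements, which is why your expression returned only newlines and spaces. Since the desired text is stored in the title attribute, you need to use @title.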
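To see the difference without running a spider, here is a minimal sketch using Python's standard-library xml.etree.ElementTree rather than Scrapy. The markup is a trimmed, well-formed copy of the cell from the question (an assumption: the real page is HTML and may not parse as XML). ElementTree's XPath subset does not support a trailing /@title step, so the attribute is read with .get() instead.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the table cell in the question.
# Assumption: the real page is HTML; this sample is only for illustration.
SNIPPET = """
<table>
  <tr>
    <td class="title" title="if you were in a job and then one day, the work..." data-id="3198695">
      <span class="titleline threadbit">
        <a id="thread_title_3198695" href="showthread.php?t=3198695">would this creep you out?</a>
      </span>
    </td>
  </tr>
</table>
"""

root = ET.fromstring(SNIPPET.strip())
td = root.find(".//td[@class='title']")

# Text sitting directly inside <td> before its first child element:
# nothing but the whitespace used to indent the markup.
direct_text = td.text

# Value of the title attribute: the string the question is after.
title = td.get("title")

print(repr(direct_text))
print(title)
```

In Scrapy itself, the expression above returns a selector list; calling .get() on it returns the first match as a plain string.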

Upvotes: 3
