user3766332

Reputation: 329

Python XPath scraping says list has no text attribute

I am using some code to scrape a PDF and build a dictionary from it. My code works when I access each text block individually, i.e.:

x = scraperwiki.pdftoxml(u.read())
r = lxml.etree.fromstring(x)
s = r.xpath('//page[@number="142"]/text[@left = "134"]')
print s[8].text

Printing s[0], s[1], etc. individually all seems to work, but when I try the same with a slice:

x = scraperwiki.pdftoxml(u.read())
r = lxml.etree.fromstring(x)
s = r.xpath('//page[@number="142"]/text[@left = "134"]')
print s[0:8].text

I get this error: AttributeError: 'list' object has no attribute 'text'

Can anyone tell me what's wrong?

Upvotes: 1

Views: 3130

Answers (1)

falsetru

Reputation: 368894

text is an attribute of each element, not of the list.

The slice s[0:8] is itself a list, so iterate over its elements:

x = scraperwiki.pdftoxml(u.read())
r = lxml.etree.fromstring(x)
s = r.xpath('//page[@number="142"]/text[@left = "134"]')
for elem in s[:8]:
    print elem.text

Or use a list comprehension:

x = scraperwiki.pdftoxml(u.read())
r = lxml.etree.fromstring(x)
s = r.xpath('//page[@number="142"]/text[@left = "134"]')
print [elem.text for elem in s[:8]]
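
To try this without the PDF, here is a minimal self-contained sketch. It makes several assumptions that are not in the question: it uses Python 3 syntax, the standard library's xml.etree.ElementTree in place of lxml, and a made-up SAMPLE_XML string standing in for whatever scraperwiki.pdftoxml() actually returns. The helper name first_texts is hypothetical as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the XML produced by scraperwiki.pdftoxml();
# the real document structure may differ.
SAMPLE_XML = """
<pdf2xml>
  <page number="142">
    <text left="134" top="10">alpha</text>
    <text left="134" top="20">beta</text>
    <text left="200" top="30">ignored</text>
    <text left="134" top="40">gamma</text>
  </page>
</pdf2xml>
""".strip()


def first_texts(xml_string, limit):
    """Return the .text of the first `limit` matching <text> elements."""
    root = ET.fromstring(xml_string)
    # findall() returns a plain list of elements, like xpath() does in lxml.
    elems = root.findall(".//page[@number='142']/text[@left='134']")
    # Slicing a list gives another list, so read .text from each element.
    return [elem.text for elem in elems[:limit]]


root = ET.fromstring(SAMPLE_XML)
matches = root.findall(".//page[@number='142']/text[@left='134']")

# Accessing .text on the slice reproduces the error from the question.
try:
    matches[0:2].text
except AttributeError as exc:
    print("slice failed:", exc)

print(first_texts(SAMPLE_XML, 2))
```

Asking for more elements than exist is safe here, because slicing past the end of a list simply returns what is available.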

Upvotes: 1
