James

Reputation: 115

"Type 'list' cannot be serialized" error when using XPath with lxml etree

I am trying to search for a string within an XML document, and then print out the entire element, or elements, that contain that string. This is my code so far:

post = open('postf.txt', 'r')
postf = str(post.read())

root = etree.fromstring(postf)

e = root.xpath('//article[contains(text(), "stuff")]')

print etree.tostring(e, pretty_print=True)

This is the XML being searched, from postf.txt:

<stuff>

<article date="2014-05-18 17:14:44" title="Some stuff">More testing
debug
[done]
<tags>Hello and stuff
</tags></article>

</stuff>

And finally, this is my error:

  File "cliassis-1.2.py", line 107, in command
    print etree.tostring(e, pretty_print=True)
  File "lxml.etree.pyx", line 3165, in lxml.etree.tostring (src\lxml\lxml.etree.c:69414)
TypeError: Type 'list' cannot be serialized.

What I want this to do is search for all elements containing the string I searched for, and then print out the tags. So if I have test and stuff, and I search for 'test', I want it to print out "test and stuff".

Upvotes: 4

Views: 13511

Answers (3)

Learner

Reputation: 5302

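You can also use the built-in join method, once each element has been serialized to a string:

e = root.xpath('//article[contains(text(), "stuff")]')
# join only accepts strings, so serialize each element first
joined_string = "".join(etree.tostring(el, encoding="unicode") for el in e)
print(joined_string)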

Upvotes: 2

Attila123

Reputation: 1052

Here is a runnable, working solution that also uses join, serializing each element in a list comprehension before joining:

from lxml import etree

root = etree.fromstring('''<stuff>

<article date="2014-05-18 17:14:44" title="Some stuff">stuff in text
<tags>Hello and stuff</tags>
</article>

<article date="whatever" title="Some stuff">no s_t_u_f_f in text
<tags>Hello and stuff</tags>
</article>

<article date="whatever" title="whatever">More stuff in text
<tags>Hello and stuff</tags>
</article>

</stuff>''')
articles = root.xpath('//article[contains(text(), "stuff")]')

print("".join([etree.tostring(article, encoding="unicode", pretty_print=True) for article in articles]))

(For encoding="unicode" see e.g. http://makble.com/python-why-lxml-etree-tostring-method-returns-bytes)
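A minimal sketch of what encoding="unicode" changes, assuming Python 3 and an installed lxml. The one-element document is made up for illustration:

```python
from lxml import etree

# Hypothetical one-element document, used only for illustration.
article = etree.fromstring('<article title="Some stuff">stuff in text</article>')

as_bytes = etree.tostring(article)                     # default: a bytes object
as_text = etree.tostring(article, encoding="unicode")  # a str instead

print(type(as_bytes).__name__)  # bytes
print(type(as_text).__name__)   # str

# Only the str form can be passed to "".join() directly.
joined = "".join([as_text])
print(joined)
```

On Python 3, joining the default bytes results with "".join raises a TypeError, which is why the solution above passes encoding="unicode".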

Upvotes: 1

unutbu

Reputation: 880079

articles = root.xpath('//article[contains(text(), "stuff")]')

for article in articles:
    print etree.tostring(article, pretty_print=True)

root.xpath returns a Python list. So e is a list. etree.tostring converts lxml _Elements to strings; it does not convert lists of _Elements to strings. So use a for-loop to print the _Elements inside the list as strings.
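The same point as a self-contained sketch, assuming Python 3 and an installed lxml. The sample XML is a shortened, made-up variant of the question's data, with "stuff" placed in the first article's leading text so that the XPath matches it:

```python
from lxml import etree

# Hypothetical sample data, shortened from the question's XML.
root = etree.fromstring(
    '<stuff>'
    '<article title="first">stuff in text<tags>Hello</tags></article>'
    '<article title="second">nothing here<tags>Hello</tags></article>'
    '</stuff>'
)

articles = root.xpath('//article[contains(text(), "stuff")]')
print(type(articles).__name__, len(articles))  # list 1

# Passing the list itself raises the TypeError from the question.
message = None
try:
    etree.tostring(articles)
except TypeError as exc:
    message = str(exc)
    print(message)

# Serializing each element in turn works.
serialized = [
    etree.tostring(article, encoding="unicode", pretty_print=True)
    for article in articles
]
for text in serialized:
    print(text)
```

Even when only one element matches, root.xpath still returns a list, so the loop (or indexing with articles[0]) is needed either way.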

Upvotes: 5
