Reputation: 20150
I have the following xml:
<text>test<br/><br/><a href="/nature/19700707">All you need to know about British birds.</a><br/></text>
I want to replace the whole content of the <text> tag with 11111.
I'm using Python and lxml, and this is my code:
import nltk
import lxml.etree as le
current_file = '/Users/noor/Dropbox/apps/APIofLife/src/clear_description/bird.rdf'
f = open(current_file,'r')
doc=le.parse(f)
for elem in doc.xpath("//text"):
    elem.text = "11111"
f.close()
f = open(current_file,'w')
f.write(le.tostring(doc))
f.close()
However, after running the above code, the result is:
<text>11111<br/><br/><a href="/nature/19700707">All you need to know about British birds.</a><br/></text>
I want to know why the whole content of the <text> tag has not been changed to 11111.
Upvotes: 1
Views: 145
Reputation: 369134
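According to the lxml.etree._Element documentation, the text property corresponds only to the text before the first subelement. The child elements (and the text that follows them) are stored separately, so assigning to text leaves them in place.
You need to delete the subelements as well: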
>>> import lxml.etree as le
>>>
>>> root = le.fromstring('''<text>test<br/><br/>
... <a href="/nature/19700707">All you need to know about British birds.</a>
... <br/></text>''')
>>> for elem in root.xpath("//text"):
...     elem.text = '1111'
...     del elem[:] # <----------
...
>>> le.tostring(root)
'<text>1111</text>'
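To make the text/children split easier to see, here is a minimal sketch of the same idea wrapped in a helper. The name replace_content is hypothetical (it is not part of lxml), and the XML is the snippet from the question inlined as a string; it also uses encoding="unicode" so that tostring returns a str rather than bytes on Python 3.

```python
import lxml.etree as le


def replace_content(elem, value):
    """Hypothetical helper: replace everything inside elem with value."""
    del elem[:]        # remove all child elements, including their tail text
    elem.text = value  # the element now contains only this text


root = le.fromstring(
    '<text>test<br/><br/>'
    '<a href="/nature/19700707">All you need to know about British birds.</a>'
    '<br/></text>'
)

# Before the change, .text covers only the text ahead of the first child;
# the <br/> and <a> elements are separate children of <text>.
first_text = root.text
child_tags = [child.tag for child in root]

for elem in root.xpath("//text"):
    replace_content(elem, "11111")

# encoding="unicode" makes tostring return str instead of bytes.
result = le.tostring(root, encoding="unicode")
print(result)
```

When working with a parsed file as in the question, doc.write(current_file) is probably a simpler way to save the tree than opening the file and writing the output of tostring by hand.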
Upvotes: 1