Reputation: 1067
I am using lxml to read my XML file, with code along the lines of what is shown below. It works fine with lxml 2.3 beta1, but with lxml 2.3 it raises an XML syntax error, shown below. I went through the release notes for both versions but could not figure out what might have caused this error or how to fix it. Please help if you have come across this or have any clues about it.
Thanks!!
Code:
from lxml import etree

def parseXml(context, attribList, elemList):
    for event, element in context:
        if element.tag in elemList:
            # read element attributes
            pass
        # called for every element seen by the parser
        element.clear()

def main(object):
    ns = '{NS}'
    attribList = ['name', 'age', 'id']
    elemList = [ns+'Employee', ns+'Experience', ns+'Employment', ns+'Project', ns+'Award']
    # fullFilePath is the path to the XML file, defined elsewhere
    context = etree.iterparse(fullFilePath, events=("start", "end"))
    parseXml(context, attribList, elemList)
Error:
File "iterparse.pxi", line 478, in lxml.etree.iterparse.next (src/lxml/lxml.etree.c:95348)
File "iterparse.pxi", line 530, in lxml.etree.iterparse._read_more_events (src/lxml/lxml.etree.c:95886)
File "parser.pxi", line 585, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:71955)
XMLSyntaxError: Namespace default prefix was not found, line 545, column 73
xml sample -
<root xmlns='NS'>
  <Employee Name="Mr.ZZ" Age="30">
    <Experience TotalYears="10" StartDate="2000-01-01" EndDate="2010-12-12">
      <Employment id = "1" EndTime="ABC" StartDate="2000-01-01" EndDate="2002-12-12">
        <Project Name="ABC_1" Team="4">
        </Project>
      </Employment>
      <Employment id = "2" EndTime="XYZ" StartDate="2003-01-01" EndDate="2010-12-12">
        <PromotionStatus>Manager</PromotionStatus>
        <Project Name="XYZ_1" Team="7">
          <Award>Star Team Member</Award>
        </Project>
      </Employment>
    </Experience>
  </Employee>
</root>
The 'Employee' elements are repeated within the root, and the error occurs after the parser has gone through many of the employees correctly.
Edit 1: On capturing the exception, I catch the following:
WARNING:NAMESPACE:NS_ERR_UNDEFINED_NAMESPACE: Namespace default prefix was not found
Upvotes: 0
Views: 3169
Reputation: 1067
OK, so I finally figured out what was going on. Following some good advice to clean up elements once they had been processed, I was clearing every element, including the root node. The root node is the one that declares the default namespace, which applies to all nodes within it. Since I had cleared the root node, the default namespace was no longer part of the nsmap of its subelements. Previous versions seem to have been forgiving of this, but the latest version is stricter.
Not clearing the root element until I was done reading the XML did the trick for me.
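For anyone hitting the same thing, here is a minimal sketch of that fix. The function name `collect_names` and the inline sample data are made up for illustration, and the fallback import exists only so the sketch runs without lxml installed. It assumes that clearing elements on "end" events only, and skipping the root until the loop has finished, is enough for your case.

```python
import io

try:
    from lxml import etree  # the library used in the question
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree

NS = "{NS}"  # namespace of the sample XML, in Clark notation


def collect_names(source, tag):
    """Stream the document and return the Name attribute of each `tag` element.

    Finished elements are cleared to save memory, but the root (which
    carries the default namespace declaration) is kept until the end.
    """
    context = etree.iterparse(source, events=("start", "end"))
    root = None
    names = []
    for event, element in context:
        if event == "start":
            if root is None:
                root = element  # the first "start" event is the root
            continue
        # From here on we only handle "end" events.
        if element.tag == tag:
            names.append(element.get("Name"))
        if element is not root:
            element.clear()
    if root is not None:
        root.clear()  # safe now: parsing has finished
    return names


sample = (
    b"<root xmlns='NS'>"
    b"<Employee Name='Mr.ZZ' Age='30'><Experience TotalYears='10'/></Employee>"
    b"<Employee Name='Mr.YY' Age='41'/>"
    b"</root>"
)
names = collect_names(io.BytesIO(sample), NS + "Employee")
```

Attributes are read before `clear()` is called, since clearing an element also drops its attributes.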
Upvotes: 2
Reputation: 5877
Default namespace problems most often arise when you are evaluating XPath expressions. For just parsing the stream as in your sample, 2.3.0 should work fine with an unprefixed default namespace.
Perhaps you should post the smallest possible XML file that gives this error (line 545 is pretty deep into the file to be hitting it).
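To illustrate the XPath point, here is a small sketch, not taken from the question's code. It uses `findall` with a `namespaces` mapping (the ElementPath subset) so that it also runs with the standard-library fallback; in lxml the `xpath()` method accepts a `namespaces` mapping in the same way. The prefix `e` and the helper name `find_employees` are arbitrary choices for the example.

```python
import io

try:
    from lxml import etree
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree

SAMPLE = b"<root xmlns='NS'><Employee Name='Mr.ZZ' Age='30'/></root>"


def find_employees(root):
    # An unprefixed path such as "Employee" looks for elements in no
    # namespace, so it matches nothing in a document whose elements all
    # live in a default namespace. Bind a prefix of our own ("e") to the
    # namespace URI and use it in the path instead.
    return root.findall("e:Employee", namespaces={"e": "NS"})


root = etree.parse(io.BytesIO(SAMPLE)).getroot()
unprefixed = root.findall("Employee")  # finds nothing
names = [e.get("Name") for e in find_employees(root)]
```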
Upvotes: 0