Reputation: 373
I have an xml file like this:
<location type="journal">
???INSERT location???
<journal title="J. Gen. Virol.">
<volumn> 84 </volumn>
<page start="2305" end="2315"/>
<year> 2003 </year>
</journal>
</location>
I am iterating through the file like so:
tree_out = etree.parse('xmlfile.xml')
updatedtext_head = '???UPDATE FROM '
insert_head = '???INSERT '
delete_head = '???DELETE '
updatedattrib_head = '???UPDATE '
updatedattrib_mid = ' FROM '
mark_end = '???'
every = 60
G = nx.DiGraph()
color_list=[]
node_text=[]
inserted_out=[]
deleted_out=[]
updatedtext_out=[]
others_out=[]
updatedattrib_out=[]
old_new_attrib_pairs=[]
full_texts=[]
for x in tree_out.iter():
    for y in x.iterancestors():
        if '???DELETE' in y.text and x not in deleted_out:
            deleted_out.append(x)
    if '???DELETE' in x.text and x not in deleted_out:
        deleted_out.append(x)
    for y in x.iterancestors():
        if '???INSERT' in y.text and x not in inserted_out:
            inserted_out.append(x)
    if '???INSERT' in x.text and x not in inserted_out:
        inserted_out.append(x)
    if '???UPDATE FROM' in x.text and x not in updatedtext_out:
        updatedtext_out.append(x)
    if '???UPDATE ' in x.text and ' FROM ' in x.text and '???' in x.text and x not in updatedattrib_out and x not in updatedtext_out:
        updatedattrib_out.append(x)
    if (re.search(r'^\s+$', x.text)) and x not in others_out and x not in deleted_out and x not in inserted_out and x not in updatedtext_out and x not in updatedattrib_out:
        others_out.append(x)
but when I encounter elements such as this:
<page start="2305" end="2315"/>
I get thrown this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-68-b66a7d063b5b> in <module>
121 deleted_out.append(x)
122
--> 123 if '???DELETE' in x.text and x not in deleted_out:
124 deleted_out.append(x)
125
TypeError: argument of type 'NoneType' is not iterable
The intended end result is that I want the elements in the list to be sorted into separate lists as I have done in the code segment above. Why does this error occur and how can I fix it?
Upvotes: 1
Views: 127
Reputation: 3514
The TypeError is caused by the empty, attribute-only element <page start="2305" end="2315"/>. Inside the loop that element is bound to the variable x, and the code tests whether '???DELETE' occurs within x.text. The text attribute holds the element's text content, and because this element has no content, x.text is None. For reference, XML elements have the following structure:
<element-name attribute1 attribute2>content</element-name>
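To see this concretely, here is a minimal sketch that prints the .text of every element in the sample file. It uses the standard library's xml.etree.ElementTree rather than lxml (an assumption, made only so that it runs without extra packages); the .text values are the same for these elements.

```python
import xml.etree.ElementTree as ET

# The sample document from the question, inlined as a string.
xml_source = """<location type="journal">
???INSERT location???
<journal title="J. Gen. Virol.">
<volumn> 84 </volumn>
<page start="2305" end="2315"/>
<year> 2003 </year>
</journal>
</location>"""

root = ET.fromstring(xml_source)

# Map each tag to its .text value. An element with no content has .text set to None.
texts = {elem.tag: elem.text for elem in root.iter()}

for tag, text in texts.items():
    print(tag, repr(text))
```

Only page prints None; every other element has at least whitespace between its opening tag and its first child or closing tag.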
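The error message reads argument of type 'NoneType' is not iterable because the in operator has the form value in container, and its right-hand operand must support membership testing (a str, a list, or any other iterable). None does not, so '???DELETE' in x.text fails whenever x.text is None.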
You should test that x.text isn't None before trying to use it like a str:
if x.text is not None and '???DELETE' in x.text and x not in deleted_out:
    deleted_out.append(x)
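The same applies to y.text in the loop over the ancestors, since an ancestor's text can also be None. Make sure deleted_out is initialised before the loop as well. For example:
tree_out = etree.parse('xmlfile.xml')
deleted_out = []
for x in tree_out.iter():
    for y in x.iterancestors():
        if y.text is not None and '???DELETE' in y.text and x not in deleted_out:
            deleted_out.append(x)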
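To avoid repeating the None check in every condition, one option is a small helper that treats a missing text as an empty string. The sketch below is an illustration under assumptions, not the asker's code: text_of and elements_marked are hypothetical names, and it uses the standard library's xml.etree.ElementTree with a parent map in place of lxml's iterancestors() so that it runs without extra packages.

```python
import xml.etree.ElementTree as ET

def text_of(elem):
    """Return the element's text, or '' when the element has no content."""
    return elem.text or ""

def elements_marked(root, marker):
    """Collect elements whose own text, or an ancestor's text, contains marker."""
    # ElementTree has no iterancestors(), so build a child -> parent map instead.
    parents = {child: parent for parent in root.iter() for child in parent}
    marked = []
    for elem in root.iter():
        node = elem
        # Walk from the element itself up to the root.
        while node is not None:
            if marker in text_of(node):
                marked.append(elem)
                break
            node = parents.get(node)
    return marked

# The sample document from the question, inlined as a string.
xml_source = """<location type="journal">
???INSERT location???
<journal title="J. Gen. Virol.">
<volumn> 84 </volumn>
<page start="2305" end="2315"/>
<year> 2003 </year>
</journal>
</location>"""

root = ET.fromstring(xml_source)
inserted_out = elements_marked(root, "???INSERT")
deleted_out = elements_marked(root, "???DELETE")

print([e.tag for e in inserted_out])
print([e.tag for e in deleted_out])
```

With lxml the parent map is unnecessary, because iterancestors() is available; the text_of helper works unchanged there and removes the need for an explicit is not None test in each condition.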
Upvotes: 1