Reputation: 131
I would like to select information from elements in a very large XML file when a related element has a certain attribute. If, as in the sample below, the sn node has the attribute elliptic="yes", I want to select the v node and retrieve its attribute values (e.g. wd="vulgui").
<sentence>
<sadv arg="argM" func="cc" tem="tmp">
<sadv>
<grup.adv>
<r lem="després" pos="rg" wd="Després"/>
<sp>
<prep>
<s lem="de" pos="sps00" postype="preposition" wd="de"/>
</prep>
<sn entityref="nne">
<spec gen="m" num="p">
<z lem="15" ne="number" wd="15"/>
</spec>
<grup.nom gen="m" num="p">
<n gen="m" lem="any" num="p" pos="ncmp000" postype="common" sense="16:10917509" wd="anys"/>
<sp>
<prep>
<s lem="de" pos="sps00" postype="preposition" wd="de"/>
</prep>
<sn entityref="nne">
<spec gen="f" num="s">
<d coreftype="ident" entity="entity3" entityref="nne" gen="f" lem="el_seu" num="s" person="3" pos="dp3fs0" postype="possessive" wd="la_seva"/>
</spec>
<grup.nom gen="f" num="s">
<n gen="f" lem="creació" num="s" pos="ncfs000" postype="common" sense="16:00583085" wd="creació"/>
</grup.nom>
</sn>
</sp>
</grup.nom>
</sn>
</sp>
</grup.adv>
</sadv>
<f lem="," pos="fc" punct="comma" wd=","/>
</sadv>
<sn arg="arg0" coreftype="ident" elliptic="yes" entity="entity3" entityref="nne" func="suj" tem="agt"/>
<grup.verb>
<v lem="presentar" lss="A32.ditransitive-patient-benefactive" mood="indicative" num="p" person="3" pos="vmip3p0" postype="main" tense="present" wd="presenten"/>
</grup.verb>
<sn arg="arg1" entityref="spec" func="cd" tem="pat">
<spec gen="m" num="s">
<d gen="m" lem="un" num="s" pos="di0ms0" postype="indefinite" wd="un"/>
</spec>
<grup.nom gen="m" num="s">
<s.a gen="m" num="s">
<grup.a gen="m" num="s">
<a gen="m" lem="nou" num="s" pos="aq0ms0" postype="qualificative" wd="nou"/>
</grup.a>
</s.a>
<n gen="m" lem="disc" num="s" pos="ncms000" postype="common" sense="16:03112307" wd="disc"/>
<sn entityref="ne" ne="other">
<f lem="," pos="fc" punct="comma" wd=","/>
<grup.nom>
<f lem="'" pos="fz" punct="mathsign" wd="'"/>
<n lem="Electroretard" ne="other" pos="np0000a" postype="proper" sense="16:cs1" wd="Electroretard"/>
<f lem="'" pos="fz" punct="mathsign" wd="'"/>
</grup.nom>
</sn>
</grup.nom>
</sn>
<f lem="." pos="fp" punct="period" wd="."/>
I couldn't come up with a solution beyond this:
for sn in root.iter('sn'):
    rank = sn.get('elliptic')
    if rank == 'yes':
How could I continue this code? I was thinking of something like:
"iterate through all children whose parent has @elliptic="yes""
Upvotes: 0
Views: 659
Reputation: 812
As I understand it, the simplest way is to build an XPath expression and use it inside an if or try/except block:
xpath = '(//sn[@elliptic="yes"])[1]'
Now write an if statement that checks whether this element exists in your XML. If it does, do what you need, e.g. use further XPath expressions to extract the required values.
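A minimal sketch of that existence check, assuming the standard-library xml.etree.ElementTree and a trimmed-down, hypothetical sample rather than the real file. As far as I know, ElementTree only implements a subset of XPath, so the parenthesised form with [1] would need a full XPath engine such as lxml; here find() plays the same role because it returns the first match or None.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample; the real file is much larger.
SAMPLE = """
<sentence>
  <sn arg="arg0" elliptic="yes" func="suj"/>
  <grup.verb>
    <v lem="presentar" wd="presenten"/>
  </grup.verb>
  <sn arg="arg1" func="cd"/>
</sentence>
"""

root = ET.fromstring(SAMPLE)

# find() returns the first matching element, or None when nothing matches.
first_elliptic = root.find('.//sn[@elliptic="yes"]')

if first_elliptic is not None:
    # The element exists, so it is safe to read its attributes.
    elliptic_func = first_elliptic.get('func')
else:
    elliptic_func = None

print(elliptic_func)
```

Testing against None avoids the need for try/except, since a missing element does not raise an error with find().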
P.S. The [1] means that you are selecting only the first matching element in the XML. If there is more than one match, the expression without it can break. So create an iterator i and use it in your XPath: (//sn[@elliptic="yes"])[i]
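One way to visit every match instead of indexing them one by one is sketched below. It assumes xml.etree.ElementTree, a hypothetical two-sentence sample wrapped in an invented article root, and a hypothetical helper name. Note that in the question's sample the v node is not a child of sn: the elliptic sn is empty and v sits inside the grup.verb sibling that follows it, so the sketch walks the following siblings.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <article> root and the second sentence are invented.
SAMPLE = """
<article>
  <sentence>
    <sn elliptic="yes" func="suj"/>
    <grup.verb><v lem="presentar" wd="presenten"/></grup.verb>
  </sentence>
  <sentence>
    <sn func="suj"><n wd="disc"/></sn>
    <grup.verb><v lem="sortir" wd="surt"/></grup.verb>
  </sentence>
</article>
"""


def verbs_after_elliptic_subjects(root):
    """Collect attribute dicts of <v> nodes in the verb group after an elliptic <sn>."""
    results = []
    # ElementTree has no following-sibling axis, so walk each parent's children in order.
    for parent in root.iter():
        children = list(parent)
        for index, child in enumerate(children):
            if child.tag == 'sn' and child.get('elliptic') == 'yes':
                # Scan the siblings that come after the elliptic <sn>.
                for sibling in children[index + 1:]:
                    if sibling.tag == 'grup.verb':
                        results.extend(dict(v.attrib) for v in sibling.iter('v'))
                        break  # only the first verb group after this <sn>
    return results


root = ET.fromstring(SAMPLE)
verbs = verbs_after_elliptic_subjects(root)
print([v['wd'] for v in verbs])
```

For a very large file, the same sibling walk could probably be applied one sentence at a time with ET.iterparse, clearing each sentence element after it has been processed to keep memory use down.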
Upvotes: 1