Reputation: 29
I am parsing the following XML to extract some values and store them in a file. I can read ordinary attributes, but when I try to pull the value I need with x.get(), as in the example below, I get None.
Could you help me understand what's happening, and how I can get the values I need?
This is an example of the xml:
<?xml version='1.0' encoding='UTF-8'?>
<sbml xmlns:fbc="http://www.sbml.org/sbml/level3/version1/fbc/version2" level="3" sboTerm="SBO:0000624" version="1" xmlns="http://www.sbml.org/sbml/level3/version1/core" fbc:required="false">
<model id="iJR904" fbc:strict="true">
<listOfUnitDefinitions> fobo </listOfUnitDefinitions>
<fbc:listOfObjectives fbc:activeObjective="obj"> Objectives </fbc:listOfObjectives>
<listOfParameters>
<parameter constant="true" id="cobra_default_lb" sboTerm="SBO:0000626" units="mmol_per_gDW_per_hr" value="-999999" />
<parameter constant="true" id="cobra_default_ub" sboTerm="SBO:0000626" units="mmol_per_gDW_per_hr" value="999999" />
<parameter constant="true" id="cobra_0_bound" sboTerm="SBO:0000626" units="mmol_per_gDW_per_hr" value="0" />
<parameter constant="true" id="R_ATPM_upper_bound" sboTerm="SBO:0000625" units="mmol_per_gDW_per_hr" value="7.6" />
<parameter constant="true" id="R_ATPM_lower_bound" sboTerm="SBO:0000625" units="mmol_per_gDW_per_hr" value="7.6" />
<parameter constant="true" id="R_EX_glc_DASH_D_e_lower_bound" sboTerm="SBO:0000625" units="mmol_per_gDW_per_hr" value="-10" />
<parameter constant="true" id="R_EX_o2_e_lower_bound" sboTerm="SBO:0000625" units="mmol_per_gDW_per_hr" value="-20" />
</listOfParameters>
<listOfCompartments> Compartments</listOfCompartments>
<listOfSpecies> Species</listOfSpecies>
<fbc:listOfGeneProducts> Products </fbc:listOfGeneProducts>
<listOfReactions>
<reaction fast="false" id="R_12PPDt" name="S-Propane-1,2-diol facilitated transport" reversible="true" fbc:lowerFluxBound="cobra_default_lb" fbc:upperFluxBound="cobra_default_ub">
</reaction>
<reaction fast="false" id="R_2DGLCNRx" name="2-dehydro-D-gluconate reductase (NADH)" reversible="false" fbc:lowerFluxBound="cobra_0_bound" fbc:upperFluxBound="cobra_default_ub">
</reaction>
</listOfReactions>
</model>
</sbml>
The Python code I wrote is:
import xml.etree.ElementTree as ET
mytree = ET.parse('iJR904.xml')
myroot = mytree.getroot()
mychild = myroot[0]
for a in range(len(mychild[6])):
    print(mychild[6][a].get('id'),':',mychild[6][a].get('fbc:lowerFluxBound'))
It prints:
R_12PPDt : None
R_2DGLCNRx : None
Since I want the numeric value rather than the text, I thought I could use an if/elif chain like this:
for a in range(len(mychild[6])):
    if mychild[6][a].get('fbc:lowerFluxBound') == 'cobra_default_lb':
        print(mychild[6][a].get('id'),':',-999999)
    elif mychild[6][a].get('fbc:lowerFluxBound') == 'cobra_0_bound':
        print(mychild[6][a].get('id'),':',0)
Upvotes: 0
Views: 88
Reputation: 2469
This is caused by the XML namespace. ElementTree expands the fbc: prefix to its namespace URI when parsing, so the attribute is not stored under the literal key 'fbc:lowerFluxBound' and get() returns None. The following library ignores namespace syntax and treats the prefixed name as a plain string.
from simplified_scrapy import SimplifiedDoc, utils
xml = utils.getFileContent('iJR904.xml')
doc = SimplifiedDoc(xml)
listOfReactions = doc.selects('listOfReactions>reaction')
# print(listOfReactions)
print([(a.id, a['fbc:lowerFluxBound']) for a in listOfReactions])
Result:
[('R_12PPDt', 'cobra_default_lb'), ('R_2DGLCNRx', 'cobra_0_bound')]
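For comparison, the same lookup can probably be done with the standard library alone. What follows is a minimal sketch, assuming the two namespace URIs declared on the <sbml> element in the question. The names NS, SAMPLE and lower_bounds are made up for illustration, and SAMPLE is a trimmed copy of the question's XML so the snippet runs without the file. ElementTree stores a namespaced attribute under the key {namespace-uri}localname, so the fbc: prefix has to be replaced by its URI. The sketch also resolves each bound id to its numeric value through listOfParameters instead of hard-coding an if/elif chain.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the <sbml> element in the question.
# The prefixes used as dictionary keys are arbitrary labels for this sketch.
NS = {
    "sbml": "http://www.sbml.org/sbml/level3/version1/core",
    "fbc": "http://www.sbml.org/sbml/level3/version1/fbc/version2",
}

# Trimmed stand-in for iJR904.xml (illustrative, not the full model).
SAMPLE = """
<sbml xmlns:fbc="http://www.sbml.org/sbml/level3/version1/fbc/version2"
      xmlns="http://www.sbml.org/sbml/level3/version1/core"
      level="3" version="1" fbc:required="false">
  <model id="iJR904" fbc:strict="true">
    <listOfParameters>
      <parameter constant="true" id="cobra_default_lb" value="-999999" />
      <parameter constant="true" id="cobra_default_ub" value="999999" />
      <parameter constant="true" id="cobra_0_bound" value="0" />
    </listOfParameters>
    <listOfReactions>
      <reaction id="R_12PPDt" fbc:lowerFluxBound="cobra_default_lb"
                fbc:upperFluxBound="cobra_default_ub" />
      <reaction id="R_2DGLCNRx" fbc:lowerFluxBound="cobra_0_bound"
                fbc:upperFluxBound="cobra_default_ub" />
    </listOfReactions>
  </model>
</sbml>
"""


def lower_bounds(root):
    """Return {reaction id: numeric lower bound} for every reaction."""
    # Map each parameter id to its numeric value.
    params = {
        p.get("id"): float(p.get("value"))
        for p in root.iterfind(".//sbml:listOfParameters/sbml:parameter", NS)
    }
    # Namespaced attribute keys use the form "{namespace-uri}localname".
    key = f"{{{NS['fbc']}}}lowerFluxBound"
    result = {}
    for reaction in root.iterfind(".//sbml:listOfReactions/sbml:reaction", NS):
        bound_id = reaction.get(key)  # e.g. "cobra_default_lb"
        result[reaction.get("id")] = params.get(bound_id)
    return result


if __name__ == "__main__":
    # With the real file: root = ET.parse('iJR904.xml').getroot()
    root = ET.fromstring(SAMPLE)
    for reaction_id, bound in lower_bounds(root).items():
        print(reaction_id, ":", bound)
```

Looking elements up by tag name rather than by position (mychild[6]) should also keep the code working if the order of the list elements in the file changes.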
Upvotes: 1