Reputation: 3756
I have a template XML file, and based on the inputs given to my program I have to generate a new XML file. The template has sections that need to be repeated based on the input data, but I don't necessarily know the structure of these sections or how many levels of nesting they have. I cannot figure out how to read in the template file in an arbitrary way that will let me populate it and then output it. Here is a section of the template file:
<Target_Table>
<Target_Name>SF1_T1</Target_Name>
<Target_Mode>
<REP>
<Target_Location_To_Repeat>
<XLocation>nextXREL</XLocation>
<YLocation>nextYREL</YLocation>
</Target_Location_To_Repeat>
<Target_Location_To_Repeat>
<XLocation>nextXREL</XLocation>
<YLocation>nextYREL</YLocation>
</Target_Location_To_Repeat>
</REP>
</Target_Mode>
<Target_Repetitions>1</Target_Repetitions>
<Meas_Window>
<Window_Size>
<XLocation>FOV</XLocation>
<YLocation>FOV</YLocation>
</Window_Size>
<Window_Location>
<XLocation>firstXREL</XLocation>
<YLocation>firstYREL</YLocation>
</Window_Location>
</Meas_Window>
<Box_Orientation>90</Box_Orientation>
<First_Feature Value="Space" />
<Meas_Params_Definition>
<Number_Of_Lines Value="Auto" />
<Number_Of_Pixels_Per_Line Value="Auto" />
<Averaging_Factor Value="1" />
</Meas_Params_Definition>
<Number_Of_Edges>1</Number_Of_Edges>
<Edge_Pair>
<Edge_Pair_Couple>
<First_Edge>1</First_Edge>
<Second_Edge>1</Second_Edge>
</Edge_Pair_Couple>
<Nominal_Corrected_Value>0</Nominal_Corrected_Value>
</Edge_Pair>
<Categories>
<Material_Type />
<Meas_Type />
<Category_Type />
<Other_Type />
</Categories>
<Bias>0</Bias>
<Template_Target_Name>SF_IMAQ_Template_Target</Template_Target_Name>
<Template_Target_PPL>
<Process>PC2</Process>
<Product>PD2</Product>
<Layer>L2</Layer>
</Template_Target_PPL>
<Meas_Auto_Box>
<Error_Code>0</Error_Code>
<Measured_CD>0</Measured_CD>
<Constant_NM2Pix>true</Constant_NM2Pix>
</Meas_Auto_Box>
<Meas_Box_Pix_Size_X>PixelSize</Meas_Box_Pix_Size_X>
<Macro_CD>0</Macro_CD>
</Target_Table>
I need to repeat the entire Target_Table section multiple times, and within each Target_Table I need to repeat the REP section multiple times. I want to write my program so that if the template changes (e.g., more levels of nesting are added) I don't have to change my program. But it seems to me that I have to know the complete structure of the file to read it in and write it back out. Is that true, or am I missing something? Is there a way to write a program that will read in a file with unknown tags and unknown levels of nesting?
Upvotes: 4
Views: 1977
Reputation: 384
Using ElementTree:
import xml.etree.ElementTree as et

# parse() accepts a filename directly, so no explicit open/close is needed
raw_data = et.parse("file.xml")
data_root = raw_data.getroot()

# Walk the first two levels below the root
for children in data_root:
    for child in children:
        print(child.tag, child.text, children.tag, children.text)
That will give you an overview of the XML tags and the text inside them. You can add more loops to step further into the tree, and check whether any of the children contain further levels. I find this method useful when the names of the XML tags vary and do not follow a known standard.
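Rather than adding one loop per level, the same traversal can be written recursively so that it handles any depth of nesting. The following is a minimal sketch, not part of the original answer: the helper name `walk` and the shortened `sample` string (cut down from the template in the question) are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def walk(element, depth=0):
    """Yield (depth, tag, text) for element and all of its descendants."""
    # element.text is None for empty elements, so fall back to "".
    text = (element.text or "").strip()
    yield depth, element.tag, text
    for child in element:
        # Recurse instead of nesting one loop per level.
        yield from walk(child, depth + 1)


# Shortened, hypothetical excerpt of the template from the question.
sample = """<Target_Table>
  <Target_Name>SF1_T1</Target_Name>
  <Target_Mode>
    <REP>
      <Target_Location_To_Repeat>
        <XLocation>nextXREL</XLocation>
        <YLocation>nextYREL</YLocation>
      </Target_Location_To_Repeat>
    </REP>
  </Target_Mode>
</Target_Table>"""

root = ET.fromstring(sample)
rows = list(walk(root))
for depth, tag, text in rows:
    # Indent each tag by its nesting depth.
    print("  " * depth + tag, text)
```

Because the function never names a specific tag, it should keep working if the template gains extra levels. If only a flat list of elements is needed, the built-in `Element.iter()` visits the same elements in document order without reporting depth.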
Upvotes: 4
Reputation: 97918
An example using BeautifulSoup:
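import sys
from bs4 import BeautifulSoup

# Read the template file named on the command line
with open(sys.argv[1]) as f:
    markup = f.read()

# html.parser lowercases tag names, hence the lowercase lookups below
soup = BeautifulSoup(markup, "html.parser")
for table in soup.find_all("target_table"):
    for loc in table.find_all("rep"):
        print(loc.xlocation.string + ", " + loc.ylocation.string)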
Output
nextXREL, nextYREL
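For the repetition itself, one possible approach (sketched here with the standard library's ElementTree rather than BeautifulSoup) is to deep-copy whole subtrees, which needs no knowledge of what is nested inside them. The function name `repeat_by_tag`, the `Recipe` wrapper element, and the shortened template string are assumptions made for illustration; only the tag names to repeat have to be known.

```python
import copy
import xml.etree.ElementTree as ET


def repeat_by_tag(root, tag, count):
    """Replace every element named `tag` below root with `count` deep copies."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    # Work backwards through document order so nested matches are expanded
    # before the element that contains them is copied.
    for original in reversed(list(root.iter(tag))):
        if original is root:
            continue  # the root has no parent to hold the copies
        parent = parent_of[original]
        index = list(parent).index(original)
        parent.remove(original)
        for offset in range(count):
            # deepcopy clones the whole subtree, whatever its structure.
            parent.insert(index + offset, copy.deepcopy(original))


# Hypothetical, shortened template; "Recipe" is an assumed wrapper element.
template = (
    "<Recipe><Target_Table><Target_Name>SF1_T1</Target_Name>"
    "<Target_Mode><REP><Target_Location_To_Repeat>"
    "<XLocation>nextXREL</XLocation><YLocation>nextYREL</YLocation>"
    "</Target_Location_To_Repeat></REP></Target_Mode>"
    "</Target_Table></Recipe>"
)

root = ET.fromstring(template)
repeat_by_tag(root, "REP", 3)            # inner section first
repeat_by_tag(root, "Target_Table", 2)   # then the enclosing section
print(ET.tostring(root, encoding="unicode"))
```

Each copy is independent, so the placeholder text (such as `nextXREL`) can then be filled in per copy from the input data. Repeating the inner section before the outer one means every copied Target_Table already carries its repeated REP blocks.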
Upvotes: 0