Cadman

Reputation: 1

Iterating through an XML file using Python

I am pretty new to Python. I'm just finishing up a college class on it and am also working through a book called "Head First Python", so I have a good basic understanding of the language, but I need some help with a task that is a bit over my head. I have an XML file that my company's CAD software reads in to assign the correct material to a 3D solid model, in our case assigning a material name and density. However, this XML has its density units in lbf.s^2/in^4, and one of our customers has requested that they be in lbm/in^3.

Can I import the XML into Python and iterate through it to change the values?

The first change would be the material unit name itself. I would need to iterate through the XML and replace every instance of this:

    <PropertyData property="MassDensity_0">

With this:

    <PropertyData property="MassDensity_4">

Then, for every instance of MassDensity_0 found, I would need to multiply the density value by a conversion factor to get the correct density in the new units. For example, the data value in the snippet below would need to be multiplied by that factor.

        <PropertyData property="MassDensity_0">
            <Data format="exponential">
            7.278130e-04</Data>
        </PropertyData>

Does it make sense to attempt this in Python? There are over a hundred materials in this file, and editing them manually would be very tedious and time-consuming. I'm hoping Python can do the heavy lifting here.

I appreciate any assistance you can provide and thank you in advance!!!

Upvotes: 0

Views: 80

Answers (2)

Yuri Khristich

Reputation: 14502

I'm sure this can, and probably should, be done with a dedicated XML module. For educational purposes, though, here is a straightforward, verbose solution in plain Python:

import re

xml_in = \
'''<PropertyData property="MassDensity_0">
    <Data format="exponential">
        7.278130e-04
    </Data>
</PropertyData>
<PropertyData property="MassDensity_0">
    <Data format="exponential">
        7.278130e-04
    </Data>
</PropertyData>
'''

# remove spaces after ">" and before "<"
xml_in = re.sub(r">\s*", ">", xml_in)
xml_in = re.sub(r"\s*<", "<", xml_in)

# split the xml by the mask
mask = "<PropertyData property=\"MassDensity_0\"><Data format=\"exponential\">"
chunks = xml_in.split(mask)

# change numbers
result = []
for chunk in chunks:
    try:
        splitted_chunk = chunk.split("<") # split the chunk by "<"
        num = float(splitted_chunk[0])    # get the number
        num *= 2                          # <--- change the number (2 is a placeholder factor)
        num = f"{num:e}"                  # get e-notation
        new_chunk = "<".join([num] + splitted_chunk[1:]) # make the new chunk
        result.append(new_chunk)          # add the new chunk to the result list
    except ValueError:
        result.append(chunk)              # if there is no number add the chunk as is

# assemble the xml back from the chunks and the mask
xml_out = mask.join(result)

# output
print(">\n<".join(xml_out.split("><")))

Output:

<PropertyData property="MassDensity_0">
<Data format="exponential">1.455626e-03</Data>
</PropertyData>
<PropertyData property="MassDensity_0">
<Data format="exponential">1.455626e-03</Data>
</PropertyData>
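
The same text-based idea can be condensed into a single re.sub call with a replacement function, which also lets you rename the property in the same pass. This is only a rough sketch: the pattern assumes the file always uses exactly the attribute layout shown in the question, and FACTOR is an assumed value for lbf.s^2/in^4 to lbm/in^3 (standard gravity in in/s^2) that you should verify yourself.

```python
import re

# Assumed conversion factor (standard gravity in in/s^2); verify before use.
FACTOR = 386.0886

xml_in = '''<PropertyData property="MassDensity_0">
    <Data format="exponential">
    7.278130e-04</Data>
</PropertyData>
'''

# group 1: text before the property name
# group 2: everything between the property name and the number
# group 3: the number itself
pattern = re.compile(
    r'(<PropertyData property=")MassDensity_0'
    r'("\s*>\s*<Data format="exponential">\s*)'
    r'([-+0-9.eE]+)'
)

def convert(match):
    # rename the property and rescale the number, keeping the whitespace as is
    value = float(match.group(3)) * FACTOR
    return f"{match.group(1)}MassDensity_4{match.group(2)}{value:e}"

xml_out = pattern.sub(convert, xml_in)
print(xml_out)
```

Unlike the version above, this leaves the original indentation and line breaks untouched, but it is just as fragile if the attribute order or quoting in the real file differs.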

Upvotes: 0

Daweo

Reputation: 36390

This looks like a task for the built-in module xml.etree.ElementTree. After loading the XML from a file or a string, you can alter it and save the changed version to a new file. It supports a subset of XPath, which should let you select the elements to change. Consider the following simple example:

import xml.etree.ElementTree as ET
xml_string = '<?xml version="1.0"?><catalog><product id="p1"><price>100</price></product><product id="p2"><price>120</price></product><product id="p3"><price>150</price></product></catalog>'
root = ET.fromstring(xml_string)  # now root is <catalog>
price3 = root.find('product[@id="p3"]/price')  # get price of product with id p3
price3.text = str(int(price3.text) + 50)  # .text is str: convert to int, add 50, then convert back to str
output = ET.tostring(root)
print(output)

Output:

b'<catalog><product id="p1"><price>100</price></product><product id="p2"><price>120</price></product><product id="p3"><price>200</price></product></catalog>'

Note that the output is bytes and as such can be written to a file opened in binary mode. Consult the docs for more information.
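
Applied to the XML from the question, the same approach might look roughly like the sketch below. Several things here are assumptions rather than facts about your file: the <Materials> wrapper element and the second property are made up so the snippet is self-contained, and the conversion factor (standard gravity in in/s^2) should be checked before you rely on it.

```python
import xml.etree.ElementTree as ET

# Assumed factor for lbf.s^2/in^4 -> lbm/in^3; verify before use.
FACTOR = 386.0886

# Hypothetical sample: <Materials> and "Other_0" are invented for illustration.
xml_string = """<Materials>
    <PropertyData property="MassDensity_0">
        <Data format="exponential">
        7.278130e-04</Data>
    </PropertyData>
    <PropertyData property="Other_0">
        <Data format="exponential">1.0e+00</Data>
    </PropertyData>
</Materials>"""

root = ET.fromstring(xml_string)

# iter() walks the whole tree, so the nesting depth does not matter
for prop in root.iter("PropertyData"):
    if prop.get("property") != "MassDensity_0":
        continue  # leave every other property alone
    prop.set("property", "MassDensity_4")
    data = prop.find("Data")
    if data is not None and data.text and data.text.strip():
        # float() ignores the surrounding whitespace and newlines
        value = float(data.text) * FACTOR
        data.text = f"{value:e}"

result = ET.tostring(root, encoding="unicode")  # str instead of bytes
print(result)
```

For a real file you would use ET.parse on the input path to get a tree, run the same loop over tree.getroot(), and save with tree.write to a new file so the original stays intact.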

Upvotes: 1
