Albz

Reputation: 2030

How to duplicate a source XML file and append in the output a progressive number to predefined attributes?

I have an XML file which contains a series of Groups, e.g.

   <Group id="CB1391262521L-10816-1390650339-936236343" name="This Speed" color="#FFFFFF">
                <Topology>
                    <TopologyRect cylinderHor="false" hcount="1" cylinderVert="false" vcount="1"/>
                </Topology>
                <Name="test">
                    <Parameter value="1" name="ssd"/>
                    <Parameter value="1" name="amp"/>
                </Name>
                <Note></Note>
                <DiagramIcon width="50" x="89" y="392" height="50"/>
</Group>
<Group id="L-14827-1391619839-708665346" name="Angle" color="#FFFFFF">
                <Topology>
                    <TopologyRect cylinderHor="false" hcount="1" cylinderVert="false" vcount="2"/>
                </Topology>
                <Name="test">
                    <Parameter value="3" name="ssd"/>
                    <Parameter value="2" name="amp"/>
                </Name>
                <Note></Note>
                <DiagramIcon width="50" x="89" y="392" height="50"/>
</Group>

and connections, e.g.

<Connection target="CB1391262521L-10816-1390650339-936236343" type="excitatory" id="L-22494-1391265621-2060625361-1" source="L-14827-1391619839-708665346" name="Connection Reflex->M">
            <Pattern></Pattern>
</Connection>

I would like to write a Unix shell script (using, for example, grep) that duplicates this file N times and, for each copy, appends the string -01, -02 and so on (depending on N) to every value of the id, target and source attributes of any tag in the XML file (Group, Connection, and others). The script should take the original .xml file and the number of copies as parameters, e.g.

myScript.py OriginalFile.xml 5

Here 5 is the number of copies: the script produces 5 .xml files, where the first file has -01 appended to all the values of the id, target and source attributes, the second file has -02, and so on.

I think it's a regex search and replace, and there is no need to parse the original XML file in Python; it can simply be read line by line as a normal text file.

Upvotes: 0

Views: 147

Answers (2)

BMW

Reputation: 45243

Pure shell

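#!/usr/bin/env bash

xml=$1   # source XML file
n=$2     # number of copies
for i in $(seq "$n")
do
  # zero-pad the counter (1 -> 01); use %02d, the 0 flag is undefined for %s
  i=$(printf "%02d" "$i")
  # insert -NN just before the closing quote of every id/target/source value
  sed -E "s/((id|target|source)=\"[^\"]+)\"/\1-$i\"/g" "$xml" > "$i.xml"
done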

How to run:

myScript.sh OriginalFile.xml 5

then you will get 5 xml files: 01.xml, 02.xml, etc.

Upvotes: 1

Variant

Reputation: 17365

Python Solution:

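import re

n = 5  # number of copies to create
with open("source.xml", "r") as source:
    xml = source.read()

# \b keeps attribute names such as "grid" from matching as "id"
pattern = re.compile(r'\b(id|target|source)="([^"]*)"')

for i in range(1, n + 1):  # suffixes run from -01 upwards, as asked
    with open("out-%02d.xml" % i, "w") as target:
        modxml = pattern.sub(r'\1="\2-%02d"' % i, xml)
        target.write(modxml)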
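
If you want the file name and the number of copies as command-line parameters, as in the question's `myScript.py OriginalFile.xml 5`, something along these lines might work. This is only a sketch: the helper name `suffix_ids` and the output naming scheme (`OriginalFile-01.xml`, `OriginalFile-02.xml`, ...) are my own assumptions, not something the question specifies.

```python
import re
import sys
from pathlib import Path

# Matches id="...", target="..." or source="..."; \b skips names like "grid".
ATTR = re.compile(r'\b(id|target|source)="([^"]*)"')


def suffix_ids(xml, index):
    """Return xml with -NN appended to every id/target/source value."""
    suffix = "-%02d" % index
    return ATTR.sub(
        lambda m: '%s="%s%s"' % (m.group(1), m.group(2), suffix), xml
    )


def main(argv):
    if len(argv) != 3:
        print("usage: %s FILE.xml COPIES" % argv[0])
        return
    src = Path(argv[1])
    xml = src.read_text()
    for i in range(1, int(argv[2]) + 1):
        # Assumed naming scheme: OriginalFile-01.xml, OriginalFile-02.xml, ...
        out = src.with_name("%s-%02d%s" % (src.stem, i, src.suffix))
        out.write_text(suffix_ids(xml, i))


if __name__ == "__main__":
    main(sys.argv)
```

The file is still treated as plain text, so it does not matter whether the input is well-formed XML. The same suffix is applied to id, target and source, so references between Connection and Group elements stay consistent within each copy.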

Upvotes: 1
