Reputation: 687
I have this awk/sed command
awk '{full=full$0}END{print full;}' initial.xml | sed 's|</Product>|</Product>\
|g' > final.xml
to break up an XML document containing a large number of tags, so that the new file has the entire contents of each Product node on a single line.
I am trying to run it from Python using os.system and the subprocess module, but that ends up wrapping the entire contents of the file into one line.
Can anyone convert it into an equivalent Python script? Thanks!
Upvotes: 0
Views: 566
Reputation: 189477
Something like this?
from __future__ import print_function
import fileinput
# Strip each line's newline, then start a new line after every closing Product tag.
for line in fileinput.input('initial.xml'):
    print(line.rstrip('\n').replace('</Product>', '</Product>\n'), end='')
I'm using the print function because the default print in Python 2.x will add a space or newline after each set of output. There are various other ways to work around that, some of which involve buffering your output before printing it.
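One of those other ways, as a rough sketch: write to the output stream directly, since write() never appends anything on its own. The function name split_after_products and the file names in the comment are only illustrative, not anything from the question beyond initial.xml and final.xml.

```python
import sys


def split_after_products(lines, out=sys.stdout):
    """Join all input lines, starting a new line after each </Product>.

    `lines` is any iterable of strings (an open file works);
    `out` is any object with a write() method.
    """
    for line in lines:
        # write() adds no separator or newline, unlike the print statement.
        out.write(line.rstrip('\n').replace('</Product>', '</Product>\n'))


# Hypothetical usage with the file names from the question:
# with open('initial.xml') as src, open('final.xml', 'w') as dst:
#     split_after_products(src, dst)
```

Like the loop above, this streams line by line instead of holding the whole document in memory.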
For the record, your problem could equally well be solved in just a simple Awk script.
awk '{ gsub(/<\/Product>/, "&\n"); printf "%s", $0 }' initial.xml
Printing output as it arrives without a trailing newline is going to be a lot more efficient than buffering the whole file and then printing it at the end, and of course, Awk has all the necessary facilities to do the substitution as well. (gsub is not available in all dialects of Awk, though.)
Upvotes: 1