user811602

Reputation: 1354

Changing XML node in file with sed or tr or perl

I have an XML file, say sample.xml, with tabs and spaces in random positions:

<T1>
     <S1 >  D1 </S1>
 <S1>D2   </  S1>
 < S2 >D3  </S2>
 <S3> D4</S3>
</T1 >

I want to change the data and the formatting to something like this:

<T1>
 <S1>D5</S1>
 <S1>D6</S1>
 <S2>D7</S2>
 <S3>D8</S3>
</T1>

I tried sed, but it does not work for a multi-line case like this one. How can I achieve this?

Upvotes: 0

Views: 544

Answers (4)

Mark O'Connor

Reputation: 77991

Remove all whitespace from the file and then format it using xmllint:

$ sed 's/[[:space:]]//g' test.xml | xmllint --format -
<?xml version="1.0"?>
<T1>
  <S1>D1</S1>
  <S1>D2</S1>
  <S2>D3</S2>
  <S3>D4</S3>
</T1>
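
The question also asks for the data to change (D1 to D5, and so on), which the command above does not do. A minimal sketch, assuming the mapping D1→D5, D2→D6, D3→D7, D4→D8 taken from the question and assuming the values only ever appear as element text; `remap_values` is a hypothetical helper name:

```shell
# Hypothetical helper: strip all whitespace, then rewrite the element text.
# The mapping is taken from the question's example; anchoring on ">" and "<"
# keeps a value from being rewritten if it also occurs inside a tag name.
remap_values() {
  sed -e 's/[[:space:]]//g' \
      -e 's/>D1</>D5</' \
      -e 's/>D2</>D6</' \
      -e 's/>D3</>D7</' \
      -e 's/>D4</>D8</'
}

# Demo on two lines of the sample input.
printf '<T1>\n     <S1 >  D1 </S1>\n</T1 >\n' | remap_values
```

The output can then be piped through `xmllint --format -` as above to restore the indentation.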

Background

As pointed out by @choroba, your input data is not a well-formed XML file:

$ cat test.xml
<T1>
     <S1 >  D1 </S1>
      <S1>D2   </  S1>
       < S2 >D3  </S2>
        <S3> D4</S3>
        </T1 >

The xmllint command states why:

$ xmllint test.xml
test.xml:3: parser error : expected '>'
      <S1>D2   </  S1>
                   ^
test.xml:3: parser error : Opening and ending tag mismatch: S1 line 3 and unparseable
      <S1>D2   </  S1>
                   ^
test.xml:4: parser error : StartTag: invalid element name
       < S2 >D3  </S2>
        ^
test.xml:4: parser error : Opening and ending tag mismatch: T1 line 1 and S2
       < S2 >D3  </S2>
                      ^
test.xml:5: parser error : Extra content at the end of the document
        <S3> D4</S3>
        ^

Upvotes: 1

jaypal singh

Reputation: 77145

This should work - tr -d ' ' < file

Your file:

[jaypal:~/Temp] cat file
<T1>
     <S1 >  D1 </S1>
 <S1>D2   </  S1>
 < S2 >D3  </S2>
 <S3> D4</S3>
</T1 >

Test:

[jaypal:~/Temp] tr -d ' ' < file
<T1>
<S1>D1</S1>
<S1>D2</S1>
<S2>D3</S2>
<S3>D4</S3>
</T1>
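
Since the question mentions tabs as well as spaces, note that `tr -d ' '` deletes only the space character. A small sketch that also deletes tabs while keeping the newlines; `strip_blanks` is a hypothetical helper name:

```shell
# Hypothetical helper: delete spaces and tabs, but keep newlines so that
# each element stays on its own line.
strip_blanks() {
  tr -d ' \t'
}

# Demo: the second line is indented with a tab followed by a space.
printf '<T1>\n\t <S1 >  D1 </S1>\n</T1 >\n' | strip_blanks
```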

Upvotes: 1

choroba

Reputation: 241998

Spaces after < or </ are not permitted in XML. Your XML is not well-formed and therefore cannot be processed by specialized tools. Normally, this should work:

xmllint --format file.xml

Upvotes: 1

Kent

Reputation: 195209

 sed -r 's/\s//g' yourXML

Does the above sed line work?

kent$  cat v.xml
<T1>
     <S1 >  D1 </S1>
 <S1>D2   </  S1>
 < S2 >D3  </S2>
 <S3> D4</S3>
</T1 >

kent$  sed -r 's/\s//g' v.xml
<T1>
<S1>D1</S1>
<S1>D2</S1>
<S2>D3</S2>
<S3>D4</S3>
</T1>

You should make sure that your XML file contains no spaces inside tags or values that need to be kept, since this command removes all of them.
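
If some values do contain spaces that must survive, a narrower variant is to remove whitespace only next to the angle brackets. This is a rough sketch, not a real XML parser: it assumes there are no attributes, comments, or CDATA sections containing `<` or `>`, and `normalize_tag_whitespace` is a hypothetical helper name:

```shell
# Hypothetical helper: remove whitespace only where it touches a tag delimiter.
# Interior spaces in element text (e.g. "hello world") are preserved.
normalize_tag_whitespace() {
  sed -E \
    -e 's/<[[:space:]]+/</g' \
    -e 's#</[[:space:]]+#</#g' \
    -e 's/[[:space:]]+>/>/g' \
    -e 's/>[[:space:]]+/>/g' \
    -e 's/[[:space:]]+</</g'
}

# Demo: broken tags are repaired, the space inside the value is kept.
printf ' < S2 >  hello world  </  S2 >\n' | normalize_tag_whitespace
# prints: <S2>hello world</S2>
```

Because leading indentation is also removed, the result can be re-indented afterwards, for example with `xmllint --format -`.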

Upvotes: 1
