Dumitru Gutu

Reputation: 579

How to insert the content of a file into another file before a pattern

I have a file, Afile:

<start>
<memory>
<hdd>10</hdd>
<hdc>40</hdc>
</memory>
<storage>
<disk>
<disk1>firstname</disk1>
</disk>
<disk>
<disk1>secondname</disk1>
</disk>
<map>
<code>1</code>
</map> 
<map>
<code>2</code>
</map> 
</storage>
</start>

I have a second file, Bfile:

<disk>
<disk1>thirdname</disk1>
</disk>

How can I use sed to insert the content of Bfile into Afile? In the end I need the following file:

<start>
<memory>
<hdd>10</hdd>
<hdc>40</hdc>
</memory>
<storage>
<disk>
<disk1>firstname</disk1>
</disk>
<disk>
<disk1>secondname</disk1>
</disk>
<disk>
<disk1>thirdname</disk1>
</disk>
<map>
<code>1</code>
</map> 
<map>
<code>2</code>
</map> 
</storage>
</start>

So the content should be inserted after the last occurrence of the pattern. When I use the following command, I get this result:

sed -e '/disk>/rBfile' Afile

<start>
<memory>
<hdd>10</hdd>
<hdc>40</hdc>
</memory>
<storage>
<disk>
<disk1>firstname</disk1>
</disk>
<disk>
<disk1>thirdname</disk1>
</disk>
<disk>
<disk1>secondname</disk1>
</disk>
<disk>
<disk1>thirdname</disk1>
</disk>
<map>
<code>1</code>
</map> 
<map>
<code>2</code>
</map> 
</storage>
</start>

So it puts the content of Bfile after each occurrence of "disk>". I need it only after the last occurrence. How should I change the command?

Upvotes: 1

Views: 160

Answers (6)

potong

Reputation: 58420

This might work for you (GNU sed):

sed -e '/<disk>/,${/<disk>/,/<\/disk>/b;ecat fileb' -e ':a;n;ba}' filea

This restricts the sed commands to the lines from the first <disk> to the end of the file. Within this range, all complete <disk> ... </disk> blocks are printed as usual. The first line after them is where the file is to be inserted: the GNU-specific e (evaluate) command runs cat fileb and emits its output immediately, before the current line (unlike the r command, which queues the file to be written after the current pattern space). The rest of the file is then printed using a simple loop.
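For readers who find the sed control flow hard to follow, the same idea can be sketched in Python. This is only an illustration of the logic described above, not part of the sed answer; the function name insert_after_disk_run is made up for the example, and it assumes the tags sit on their own lines as in the question.

```python
def insert_after_disk_run(lines, insert_lines):
    """Copy lines, adding insert_lines before the first line that
    follows the initial run of <disk> ... </disk> blocks."""
    out = []
    seen_disk = False  # reached the first <disk> yet?
    in_disk = False    # currently inside a <disk> ... </disk> block?
    inserted = False
    for line in lines:
        if not inserted:
            if not in_disk and "<disk>" in line:
                seen_disk = True
                in_disk = True
            elif in_disk:
                if "</disk>" in line:
                    in_disk = False
            elif seen_disk:
                # First line after the run of disk blocks: insert here.
                out.extend(insert_lines)
                inserted = True
        out.append(line)
    return out
```

As with the sed version, nothing is inserted if no line follows the last </disk>.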

Upvotes: 0

Gustavo Baseggio

Reputation: 19

Just to add some examples using AWK.

Assuming that we have:

afile:

<start>
<memory>
<hdd>10</hdd>
<hdc>40</hdc>
</memory>
<storage>
<disk>
<disk1>firstname</disk1>
</disk>
<disk>
<disk1>secondname</disk1>
</disk>
</storage>
</start>

and bfile:

<disk>
<disk1>thirdname</disk1>
</disk>

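AWK, using the </storage> tag as the reference point (the > 0 test keeps the loop from running forever if bfile cannot be read, since getline then returns -1):

awk '/^<\/storage>/{while((getline line<"bfile")>0){print line};print;next}1' afile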

That will result in:

<start>
<memory>
<hdd>10</hdd>
<hdc>40</hdc>
</memory>
<storage>
<disk>
<disk1>firstname</disk1>
</disk>
<disk>
<disk1>secondname</disk1>
</disk>
<disk>
<disk1>thirdname</disk1>
</disk>
</storage>
</start>

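But in case you REALLY need to look for </disk>, I would do something like the following. Each <disk> block contributes two lines to the range (the </disk1> line and the </disk> line), so n=4 means "after the second block":

awk -v n=4 '{print;}/<\/disk1>$/,/^<\/disk>/{m++}(m==n){n=0;while((getline l<"bfile")>0){print l}}' afile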

In addition, you can also use xmllint to format the output for you:

awk -v n=4 '{print;}/<\/disk1>$/,/^<\/disk>/{m++}(m==n){n=0;while((getline l<"bfile")>0){print l}}' afile | xmllint --format --recover -

That will result in:

<start>
  <memory>
    <hdd>10</hdd>
    <hdc>40</hdc>
  </memory>
  <storage>
    <disk>
      <disk1>firstname</disk1>
    </disk>
    <disk>
      <disk1>secondname</disk1>
    </disk>
    <disk>
      <disk1>thirdname</disk1>
    </disk>
  </storage>
</start>

Upvotes: 2

NeronLeVelu

Reputation: 10039

If the insertion point is the closing </storage> tag (the first sample given):

sed '\#</storage># {r Bfile
   N;} ' Afile

If the content must go after the last disk in storage (as in the edited version of the question):

sed '1;\#<storage>#{1h;1!H
    \#<storage># {g
       s#^\(.*\n</disk>\).*#\1#p
       r Bfile
       G;N
       s/^\(.*\)\1\(.*\)/\2/
       }
   }' Afile

Normally the r command only queues the file: it is written out at the end of the cycle, after the current line has been printed. With an N right after it, sed fetches the next input line first, which flushes the queued file before the current line is printed, and the current line stays in the buffer together with the next one.

So this only works if there is a line after </storage> (a test for that case could be added beforehand).
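The "buffer everything, then cut at the last </disk>" idea is easier to see outside sed. The following Python sketch is an illustration only, not a translation of the script above; the function name insert_after_last and the assumption that the closing tag sits alone on its line are mine.

```python
def insert_after_last(lines, insert_lines, marker="</disk>"):
    """Return a new list with insert_lines placed right after the
    last line whose stripped text equals marker."""
    lines = list(lines)
    last = None
    for i, line in enumerate(lines):
        if line.strip() == marker:
            last = i  # keep overwriting, so the final match wins
    if last is None:
        return lines  # marker not found: leave the input unchanged
    return lines[:last + 1] + list(insert_lines) + lines[last + 1:]
```

Unlike the sed version, this does not need a line after the insertion point.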

Upvotes: 2

Serge Ballesta

Reputation: 148910

If ed is an option (that is, if the input file is not too big), it would be easier:

echo '/map/-1 r Bfile
wq' | ed Afile

Upvotes: 0

Alfwed

Reputation: 3282

I didn't manage to do that in a single line, so I made a sed script. The problem is that the r command treats everything up to the end of the line as the file name, so it needs to be on its own line.

#!/bin/sed -f

/<\/disk>/{
  :a 
  n
  s/disk/disk/
  t a
  h
  r Bfile
  g
  N
}

You can then call it like this:

sed -f sedscript Afile

Upvotes: 3

Wintermute

Reputation: 44043

XML (like structured data in general) shouldn't be handled with plain-text tools like awk and sed except in very special cases. Nobody expects tools that consume XML to break when newlines change places or spaces are inserted or removed in benign places, but line-based scripts do exactly that.

Instead, I'd use Python, which has an XML parser in its standard library:

#!/usr/bin/python

import xml.etree.ElementTree as ET
import sys

# File names are taken from the command line arguments.
target = ET.parse(sys.argv[1])
insert = ET.parse(sys.argv[2])

# Interesting part here: append() adds the new element as the last child of <storage>.
target.getroot().find("./storage").append(insert.getroot())

# To write to a file, use target.write('output.xml')
ET.dump(target)

Call that as

python foobar.py fileA fileB
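Note that append() places the new element after the existing <map> children, whereas the question asks for it directly after the last <disk>. A possible variation, sketched below, uses Element.insert() with a computed position. The helper name insert_after_last and the inline sample documents are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def insert_after_last(parent, new_elem, tag):
    """Insert new_elem right after the last direct child of parent
    named tag, or at the end if there is no such child."""
    children = list(parent)
    matches = [i for i, child in enumerate(children) if child.tag == tag]
    position = matches[-1] + 1 if matches else len(children)
    parent.insert(position, new_elem)


# Small inline documents stand in for fileA and fileB.
storage = ET.fromstring("<storage><disk/><disk/><map/><map/></storage>")
new_disk = ET.fromstring("<disk><disk1>thirdname</disk1></disk>")
insert_after_last(storage, new_disk, "disk")
print([child.tag for child in storage])
```

In the script above, the same helper could be called as insert_after_last(target.getroot().find("./storage"), insert.getroot(), "disk") in place of the append() line.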

Upvotes: 3
