kyneese

Reputation: 11

How to use awk to insert multiple lines after first match of a pattern, in multiple files

I have a directory containing many subdirectories, each containing a config.xml file I want to edit. Like:

../jobs/foo_bar-v1.2_west/config.xml
../jobs/foo_bar-v1.3_west/config.xml
../jobs/foo_stuff-v1.3_east/config.xml
../jobs/foo_foo-v9.8_north/config.xml
../jobs/NOT_FOO-v0.1_whatev/config.xml
etc.

I need a way to insert multiple lines of text into several of the ../jobs/foo*/config.xml files, after matching the first instance of a specific line, <properties>.

Text to insert looks like:

    <a.bunch.of.TextGoesHere>
      <permission>one.foo.Items.Foo:person.name</permission>
      <permission>two.foo.Items.Foo:person.name</permission>
      <permission>three.foo.Items.Foo:person.name</permission>
    </a.bunch.of.TextGoesHere>

Each ../jobs/foo*/config.xml looks like:

<?xml version='1.0' encoding='UTF-8'?>
<foo1>
  <actions/>
  <description>foo2</description>
  <keepDependencies>false</keepDependencies>
  <properties>
    <foo3/>
  </properties>
 ...
  <lots_of_other_stuff>
  <properties>
    <junk>
  </properties>

Final output for each config.xml should look like:

<?xml version='1.0' encoding='UTF-8'?>
<foo1>
  <actions/>
  <description>foo2</description>
  <keepDependencies>false</keepDependencies>
  <properties>
    <a.bunch.of.TextGoesHere>
      <permission>one.foo.Items.Foo:person.name</permission>
      <permission>two.foo.Items.Foo:person.name</permission>
      <permission>three.foo.Items.Foo:person.name</permission>
    </a.bunch.of.TextGoesHere>
    <foo3/>
  </properties>
 ...
  <lots_of_other_stuff>
  <properties>
    <junk>
  </properties>

I've tried using sed to insert after a specific line, like

#!/bin/bash
find ../jobs/run* -name config.xml -exec sed -i '6a\
<text to insert>' {} \;

but occasionally, long <description> text from the config.xml results in an unpredictable line number on which to insert.

Next I tried using sed to search for the first instance of <properties> and inserting after, like

sed -i '0,/properties/a test' config.xml

but this added the text test after EVERY line until <properties> was found. Using sed -i '1,/ had similar results. It was ugly.

I'm unsure if I'm using sed properly on this Amazon Linux box, and am thinking awk might work better here. Can anyone assist? Thanks.

Upvotes: 0

Views: 1679

Answers (3)

Ed Morton

Reputation: 203229

With GNU awk for inplace editing all you need is:

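awk -i inplace '
# First file: collect the text to insert (no "next", so this file is printed back to itself unchanged)
NR==FNR { text = (NR>1 ? text ORS : "") $0 }
# Reset the match counter at the start of every file
FNR==1 { cnt=0 }
{ print }
# After the first <properties> line of each file, print the collected text
/<properties>/ && !cnt++ { print text }
' file_containing_text_to_insert ../jobs/foo*/config.xml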
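If gawk's inplace extension isn't available, roughly the same logic can be run with any POSIX awk by writing to a temporary file and copying it back. This is only a sketch: the scratch directory, the file names and the seen variable are made up for the demo, and each awk call handles a single config.xml, so no per-file reset is needed.

```shell
#!/bin/sh
# Sketch only: hypothetical scratch data, not the real ../jobs tree.
demo=$(mktemp -d)
mkdir -p "$demo/jobs/foo_a"
printf '%s\n' '<foo1>' '  <properties>' '    <foo3/>' '  </properties>' \
    '  <properties>' '    <junk/>' '  </properties>' '</foo1>' \
    > "$demo/jobs/foo_a/config.xml"
printf '%s\n' '    <inserted>' '    </inserted>' > "$demo/insert.txt"

for f in "$demo"/jobs/foo*/config.xml; do
    tmp=$(mktemp) || exit 1
    awk '
        NR==FNR { text = (NR>1 ? text ORS : "") $0; next }  # load the text to insert
        { print }
        /<properties>/ && !seen++ { print text }            # first match only
    ' "$demo/insert.txt" "$f" > "$tmp" && cat "$tmp" > "$f" # overwrite contents, keep file permissions
    rm -f "$tmp"
done

cat "$demo/jobs/foo_a/config.xml"
```

Copying the result back with cat instead of mv keeps the original file's owner and mode, at the cost of not being atomic.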

Upvotes: 1

glenn jackman

Reputation: 246764

Following up on my comment with an answer:

The input xml file "file.xml"

<?xml version='1.0' encoding='UTF-8'?>
<foo1>
  <actions/>
  <description>foo2</description>
  <keepDependencies>false</keepDependencies>
  <properties>
    <foo3/>
  </properties>
 ...
  <lots_of_other_stuff />
  <properties>
    <junk />
  </properties>
</foo1>

The xslt stylesheet "file.xslt"

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- Identity transform -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>
    <!-- insert the new stuff before the first child of the first properties element -->
    <xsl:template match="/foo1/properties[1]/*[1]">
        <a.bunch.of.TextGoesHere>
            <permission>one.foo.Items.Foo:person.name</permission>
            <permission>two.foo.Items.Foo:person.name</permission>
            <permission>three.foo.Items.Foo:person.name</permission>
        </a.bunch.of.TextGoesHere>
        <xsl:copy-of select="."/>
   </xsl:template>
</xsl:stylesheet>

The result, using

$ xmlstarlet transform file.xslt file.xml 
<?xml version="1.0"?>
<foo1>
  <actions/>
  <description>foo2</description>
  <keepDependencies>false</keepDependencies>
  <properties>
    <a.bunch.of.TextGoesHere><permission>one.foo.Items.Foo:person.name</permission><permission>two.foo.Items.Foo:person.name</permission><permission>three.foo.Items.Foo:person.name</permission></a.bunch.of.TextGoesHere><foo3/>
  </properties>
 ...
  <lots_of_other_stuff/>
  <properties>
    <junk/>
  </properties>
</foo1>

To apply to all your files:

find . -name config.xml -exec sh -c '
    for xmlfile; do
        xmlstarlet transform file.xslt "$xmlfile" > "$xmlfile".new &&
        ln "$xmlfile" "$xmlfile".bak &&
        mv "$xmlfile".new "$xmlfile"
    done
' sh {} +

Upvotes: 1

Benjamin W.

Reputation: 52112

Assuming the text to insert is in a file called insert:

sed -e '0,/<properties>/{/<properties>/r insert' -e '}' config.xml

The r command reads a file and appends it after the current line; the

0,/pattern/{/pattern/r filename}

makes sure that only the first instance of pattern gets the text appended. Because the command has to end after the filename read by r, it has to be split into two parts using -e.
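Before touching real files, the command can be tried on throwaway data without -i so the result only goes to stdout. A minimal sketch, assuming GNU sed (the 0,/re/ address is a GNU extension); the sample file contents are made up:

```shell
#!/bin/sh
# Dry run on scratch files; nothing is edited in place.
work=$(mktemp -d) && cd "$work" || exit 1
printf '%s\n' 'a' '<properties>' 'b' '<properties>' 'c' > config.xml
printf '%s\n' 'NEW1' 'NEW2' > insert

# The first -e ends right after the file name read by r; the second closes the block.
result=$(sed -e '0,/<properties>/{/<properties>/r insert' -e '}' config.xml)
printf '%s\n' "$result"
```

Only the first <properties> line should be followed by the two NEW lines; the second one is outside the range and is left alone.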

To edit the files in-place, use sed -i (for GNU sed).

To do this for multiple files, you could use find:

find jobs -name 'config.xml' \
    -exec sed -i -e '0,/<properties>/{/<properties>/r insert' -e '}' {} +

This requires that the insert file is in the directory from which you run this command.
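One way around that restriction is to give r an absolute path. The sketch below assumes GNU sed and a path without newlines; the directory layout and the insert_path variable are hypothetical and only there for the demo. The first -e is double-quoted so the shell expands the variable.

```shell
#!/bin/sh
# Scratch layout standing in for the real jobs directory.
top=$(mktemp -d)
mkdir -p "$top/jobs/foo_a" "$top/jobs/foo_b" "$top/elsewhere"
for d in foo_a foo_b; do
    printf '%s\n' '<properties>' 'x' '<properties>' > "$top/jobs/$d/config.xml"
done
printf '%s\n' 'NEW' > "$top/insert"
insert_path="$top/insert"        # absolute path to the text to insert

cd "$top/elsewhere" || exit 1    # deliberately not the directory holding "insert"
find "$top/jobs" -name 'config.xml' \
    -exec sed -i -e "0,/<properties>/{/<properties>/r $insert_path" -e '}' {} +

cat "$top/jobs/foo_a/config.xml"
```

With -i, GNU sed treats each file separately, so the 0,/<properties>/ range starts over for every config.xml that find passes in.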


Your commands seemed almost correct, except that you didn't nest a second address into your range to make sure the appending happened just once.
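To see the difference side by side, here is a small comparison on made-up input, again assuming GNU sed (both the 0,/re/ address and the one-line a text form are GNU features):

```shell
#!/bin/sh
# Made-up sample input with two <properties> lines.
sample=$(printf '%s\n' 'a' '<properties>' 'b' '<properties>' 'c')

# Range only: "a" runs on every line from the start through the first match.
range_only=$(printf '%s\n' "$sample" | sed '0,/<properties>/a test')

# Range plus nested address: "a" runs only on the matching line inside the range.
nested=$(printf '%s\n' "$sample" |
    sed -e '0,/<properties>/{/<properties>/a test' -e '}')

printf 'range only:\n%s\n\nnested:\n%s\n' "$range_only" "$nested"
```

The range-only form appends test after both a and the first <properties>, which is the behaviour described in the question; the nested form appends it once.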

Upvotes: 1
