binhex

Reputation: 394

sed find regex pattern then find next regex pattern (variable multi line) and replace

I am trying to insert the string ' .sh' into a particular line of a text file using sed. The insertion should happen only if an earlier regex matches, it must cope with a variable number of lines between the 'name' and 'extension' tags, and it must be idempotent: running it multiple times should leave a single ' .sh' in the space-separated list inside the 'extension' tag.

Here is a small snippet of the text file:

    <name>gba</name>
    <fullname>Game Boy Advance</fullname>
    <manufacturer>Nintendo</manufacturer>
    <release>2001</release>
    <hardware>portable</hardware>
    <path>/storage/roms/gba</path>
    <extension>.gba .GBA .zip .ZIP .7z .7Z</extension>

So I want to change <extension>.gba .GBA .zip .ZIP .7z .7Z</extension> to <extension>.gba .GBA .zip .ZIP .7z .7Z .sh</extension>, but only if the name tag is <name>gba</name>.

This is the best I have come up with so far, but it has two problems: first, subsequent executions duplicate the insertion; second, it assumes a fixed number of lines (6) between the two tags, which may not always be the case:

sed -i '/<name>gba<\/name>/{n;n;n;n;n;n;s/<\/extension>/ .sh<\/extension>/}' /tmp/test.txt
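Both problems can be reproduced with a small sketch. It assumes GNU sed, pipes the text through the same command instead of using -i, and the shorter entry is a made-up example rather than part of the real file:

```shell
# The fixed-count command from the question, reading stdin instead of editing in place.
cmd='/<name>gba<\/name>/{n;n;n;n;n;n;s/<\/extension>/ .sh<\/extension>/}'

# Seven-line entry from the question: <extension> is exactly six lines below <name>.
full='<name>gba</name>
<fullname>Game Boy Advance</fullname>
<manufacturer>Nintendo</manufacturer>
<release>2001</release>
<hardware>portable</hardware>
<path>/storage/roms/gba</path>
<extension>.gba .GBA .zip .ZIP .7z .7Z</extension>'

# Hypothetical shorter entry: <extension> is only two lines below <name>.
short='<name>gba</name>
<path>/storage/roms/gba</path>
<extension>.gba .GBA .zip .ZIP .7z .7Z</extension>'

# Problem 1: a second pass appends ' .sh' again.
twice=$(printf '%s\n' "$full" | sed "$cmd" | sed "$cmd" | tail -n 1)
printf '%s\n' "$twice"

# Problem 2: with fewer lines in between, input runs out before the s command is reached.
short_out=$(printf '%s\n' "$short" | sed "$cmd")
printf '%s\n' "$short_out"
```

The first output line ends in ' .sh .sh</extension>', and the shorter entry comes back without any ' .sh' at all.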

Upvotes: 2

Views: 117

Answers (3)

potong

Reputation: 58420

This might work for you (GNU sed):

sed '/<name>gba<\/name>/{:a;n;/<\/extension>/!ba;/\.sh/!s/<\/extension/ .sh&/}' file

Focus on a line containing <name>gba</name>.

Print the current line, fetch the next and if that line does not contain </extension>, repeat.

Otherwise, if the current line does not already contain .sh, insert ' .sh' immediately before the closing </extension tag.
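A quick way to check that the loop is idempotent is to run the same command twice and compare the results. This is only a sketch assuming GNU sed; the second entry (<name>nes</name>) is invented here to show that other entries are left alone:

```shell
# Same sed script as above, stored in a variable so it can be applied twice.
cmd='/<name>gba<\/name>/{:a;n;/<\/extension>/!ba;/\.sh/!s/<\/extension/ .sh&/}'

# Sample input; the nes entry is hypothetical and not part of the question.
sample='<name>gba</name>
<path>/storage/roms/gba</path>
<extension>.gba .GBA .zip .ZIP .7z .7Z</extension>
<name>nes</name>
<path>/storage/roms/nes</path>
<extension>.nes .NES</extension>'

once=$(printf '%s\n' "$sample" | sed "$cmd")    # first pass adds ' .sh'
twice=$(printf '%s\n' "$once" | sed "$cmd")     # second pass should change nothing

gba_line=$(printf '%s\n' "$twice" | sed -n '3p')
nes_line=$(printf '%s\n' "$twice" | sed -n '6p')
printf '%s\n%s\n' "$gba_line" "$nes_line"
```

If the two passes produce identical text, the /\.sh/! guard is doing its job.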

Upvotes: 0

RavinderSingh13

Reputation: 133518

If you are OK with awk, you could try the following. Written and tested with the samples shown.

awk '
/<name>/{ found="" }
/<name>gba<\/name>/{
  found=1
}
found && /<extension>/ && !/\.sh</{
  sub(/<\//," .sh&")
  found=""
}
1
'  Input_file

Explanation: a commented version of the above.

awk '                     ##Start of the awk program.
/<name>/{ found="" }      ##Reset found on every <name> line.
/<name>gba<\/name>/{      ##Match the line containing <name>gba</name>.
  found=1                 ##Set found to 1.
}
found && /<extension>/ && !/\.sh</{   ##If found is set, the line contains <extension> and its list does not already end with .sh, do the following.
  sub(/<\//," .sh&")      ##Replace the first </ with " .sh" followed by the matched text.
  found=""                ##Clear found.
}
1                         ##Print the current line.
'  Input_file             ##Name of the input file.
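The same flag-based idea can be parameterised. The following is a variation and not the code above: the function name add_ext and the variables name and ext are made up for illustration, and the "already present" check compares whole space-separated tokens instead of matching .sh< at the end of the list:

```shell
# Hypothetical helper: add_ext NAME EXT, reads the text on stdin.
add_ext() {
  awk -v name="$1" -v ext="$2" '
    # Every <name> line sets or clears the flag.
    /<name>/ { found = (index($0, "<name>" name "</name>") > 0) }
    found && /<extension>/ {
      # Isolate the list between the tags and look for ext as a whole token.
      list = $0
      sub(/.*<extension>/, "", list)
      sub(/<\/extension>.*/, "", list)
      n = split(list, tok, " ")
      have = 0
      for (i = 1; i <= n; i++) if (tok[i] == ext) have = 1
      if (!have) sub(/<\/extension>/, " " ext "&")
      found = 0
    }
    1   # print every line, changed or not
  '
}

sample='<name>gba</name>
<fullname>Game Boy Advance</fullname>
<extension>.gba .GBA .zip</extension>
<name>nes</name>
<extension>.nes .NES</extension>'

once=$(printf '%s\n' "$sample" | add_ext gba .sh)
twice=$(printf '%s\n' "$once" | add_ext gba .sh)
printf '%s\n' "$twice"
```

Because ext is spliced into the replacement string of sub(), this sketch assumes it contains no & or backslash characters.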

Upvotes: 2

Sundeep

Reputation: 23667

Tested with GNU sed (syntax might vary for other implementations):

sed '/<name>gba</,/<extension>/{/<extension>/{/\.sh/! s/<\/extension>/ .sh&/}}'
  • /<name>gba</,/<extension>/ matches the range of lines starting at a line containing <name>gba< and ending at the next line containing <extension>
    • this is based on the given sample; you can modify the regex if you need a more robust matching condition
  • {} groups commands so that they are executed only for the given matching conditions
  • for such a range of lines:
    • /<extension>/ selects only the line containing <extension>
    • /\.sh/! runs the following command only if the line doesn't already contain .sh
    • s/<\/extension>/ .sh&/ adds ' .sh' before the closing tag on such lines
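Because the range ends wherever <extension> first appears, the number of lines in between should not matter. A small sketch to try this, assuming GNU sed and two made-up inputs of different lengths fed on stdin:

```shell
# Range-based script from above, stored in a variable.
cmd='/<name>gba</,/<extension>/{/<extension>/{/\.sh/! s/<\/extension>/ .sh&/}}'

# Hypothetical entry with no lines between <name> and <extension>.
short='<name>gba</name>
<extension>.gba .zip</extension>'

# Hypothetical entry with three lines in between.
long='<name>gba</name>
<fullname>Game Boy Advance</fullname>
<manufacturer>Nintendo</manufacturer>
<path>/storage/roms/gba</path>
<extension>.gba .zip</extension>'

short_out=$(printf '%s\n' "$short" | sed "$cmd")
long_out=$(printf '%s\n' "$long" | sed "$cmd")
long_again=$(printf '%s\n' "$long_out" | sed "$cmd")   # second pass over the result

printf '%s\n' "$short_out" | tail -n 1
printf '%s\n' "$long_out" | tail -n 1
```

Both printed lines should read <extension>.gba .zip .sh</extension>, and the second pass should leave the longer entry unchanged.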

Upvotes: 3
