durzel

Reputation: 3

sed: Replace words in pattern with variables across 1 or more lines

I'm writing a bash script that will scan a list of PHP files to look for a specific "class X extends Y" string and replace Y with another string that I have chosen (Z).

So far I've managed to come up with a sed line that works if the entire class X extends Y pattern is on one line, but I can't seem to get it to work where, for example, there is a newline after X, with "extends" following on the next line (in some cases prefaced with a tab/whitespace). I guess it could be possible for "class", X, "extends" and Y to be on 4 separate lines and still be valid PHP? I dunno..

My sed line so far is as follows (this is inside of a loop):

sed -i -E "s/class (\w+) extends $OLDMODEL/\/\/ Modified by `basename "$0"` - `date` - Original model: $OLDMODEL\n\tclass \1 extends $NEWMODEL/" $FILELOC

In the above line $OLDMODEL is Y, $NEWMODEL is Z, and $FILELOC is the absolute path to the file to be changed. When matched, the command inserts a comment above the changed line that records the script name, the date/time the change was made, and the substituted class.

Most of the tutorials I've seen online that deal with multi-line pattern matching seem to use sed -e rather than -E, which is causing me issues with variable substitution and other things that are only working with extended regexp.

X, Y and Z are alphabetic strings with underscores in, e.g. Some_Model_Class_Thing

Would appreciate any help anyone can give me. Thanks in advance :)

I realise I could probably use perl or python, but if it's possible in sed I'd like to be able to get it working, particularly as one-line substitutions are working fine.

Sample input:

 1. class FooModule_Sales_Model_Quote extends BarModule_Some_Other_Class
 2. class FooModule_Sales_Model_Quote
            extends BarModule_Some_Other_Class

Sample output:

 1. class FooModule_Sales_Model_Quote extends New_Module_Better_Class
 2. class FooModule_Sales_Model_Quote extends New_Module_Better_Class

Upvotes: 0

Views: 222

Answers (1)

KamilCuk

Reputation: 141493

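With GNU sed you can use the -z option, which separates records on NUL bytes instead of newlines, so a whole text file is processed as a single "line" and the pattern can span newlines: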

sed -z 's@class[[:space:]]\+'"$X"'[[:space:]]\+extends[[:space:]]\+'"$Y"'\([[:space:]{]\)@class '"$X"' extends '"$Z"'\1@g'

Tested on repl:

cat <<EOF >input
1. class FooModule_Sales_Model_Quote extends BarModule_Some_Other_Class
2. class FooModule_Sales_Model_Quote
            extends BarModule_Some_Other_Class

EOF

X=FooModule_Sales_Model_Quote
Y=BarModule_Some_Other_Class
Z=New_Module_Better_Class

<input sed -z 's@'\
'class[[:space:]]\+'"$X"'[[:space:]]\+extends[[:space:]]\+'"$Y"'\([[:space:]{]\)@'\
'class '"$X"' extends '"$Z"'\1@g'

will output:

1. class FooModule_Sales_Model_Quote extends New_Module_Better_Class
2. class FooModule_Sales_Model_Quote extends New_Module_Better_Class

Note that you can also preserve the newlines from the input: capture each whitespace run with \([[:space:]]\+\) and refer to it with a backreference in the replacement string.
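To illustrate the backreference idea, here is a minimal sketch, assuming GNU sed and the same X, Y and Z values as in the test above. The sample input (with a trailing { added) and the variable name result are only for illustration. Each whitespace run is captured and put back unchanged, so the line break before "extends" survives:

```shell
X=FooModule_Sales_Model_Quote
Y=BarModule_Some_Other_Class
Z=New_Module_Better_Class

# \1, \2 and \3 hold the original whitespace runs; \4 is the character after $Y.
result=$(printf 'class FooModule_Sales_Model_Quote\n\textends BarModule_Some_Other_Class {\n' |
  sed -z 's@class\([[:space:]]\+\)'"$X"'\([[:space:]]\+\)extends\([[:space:]]\+\)'"$Y"'\([[:space:]{]\)@class\1'"$X"'\2extends\3'"$Z"'\4@g')

printf '%s\n' "$result"
```

The output keeps the original two-line layout, with only the parent class name changed.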

I also added \([[:space:]{]\), which requires whitespace or a { after the $Y class name in the extends clause. Otherwise the regex could match just a prefix of a class name: for example, you may want to replace BarModule_Some with one thing and BarModule_Some_Other_Class with something different. To stop sed from matching both with one regex, you need to match the character that follows the class name.
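A small sketch of that prefix problem, assuming GNU sed. Here Y is deliberately set to a prefix of the real parent class, and the names line, unguarded and guarded are made up for this example:

```shell
Y=BarModule_Some          # deliberately only a prefix of the real parent class
Z=New_Module_Better_Class
line='class FooModule_Sales_Model_Quote extends BarModule_Some_Other_Class {'

# Without the trailing group, the prefix matches and the class name gets mangled.
unguarded=$(printf '%s\n' "$line" |
  sed -z 's@extends[[:space:]]\+'"$Y"'@extends '"$Z"'@g')

# With the trailing group, the character after $Y must be whitespace or "{",
# so this input is left untouched.
guarded=$(printf '%s\n' "$line" |
  sed -z 's@extends[[:space:]]\+'"$Y"'\([[:space:]{]\)@extends '"$Z"'\1@g')

printf '%s\n%s\n' "$unguarded" "$guarded"
```

The first line printed ends in New_Module_Better_Class_Other_Class, which is the unwanted partial replacement; the second line is the input unchanged.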

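Without GNU sed, you could substitute the newlines with some unprintable character (tr '\n' $'\01'), run sed, and then convert them back (tr $'\01' '\n'). The pattern then has to accept that character wherever it accepts whitespace, because [[:space:]] does not match \01, and \+ (a GNU extension) should be written out in a portable form.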
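A rough sketch of the tr round trip, under the assumption that the byte \001 never occurs in the PHP files. The variable names ph, ws, end and result are hypothetical. The whitespace class is extended to include the placeholder, and "one or more" is spelled as the class followed by the class with *, which avoids \+:

```shell
X=FooModule_Sales_Model_Quote
Y=BarModule_Some_Other_Class
Z=New_Module_Better_Class

# Placeholder byte standing in for newlines (assumed absent from the input).
ph=$(printf '\001')
ws="[[:space:]$ph]"      # whitespace or the placeholder
end="[[:space:]{$ph]"    # character allowed right after the class name

result=$(printf 'class FooModule_Sales_Model_Quote\n\textends BarModule_Some_Other_Class {\n' |
  tr '\n' '\001' |
  sed 's@class'"$ws$ws"'*'"$X$ws$ws"'*extends'"$ws$ws"'*'"$Y"'\('"$end"'\)@class '"$X"' extends '"$Z"'\1@g' |
  tr '\001' '\n')

printf '%s\n' "$result"
```

After tr, sed sees the whole file as one line with no trailing newline. GNU sed passes that through as is, but other implementations may append a newline, so check the end of the file when using this outside a command substitution.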

Upvotes: 2
