Reputation: 205
I need to use a regex in a bash script to substitute text in a file that might span multiple lines. In other regex engines that I know, I would pass s as a flag, but I am having a hard time doing this in bash.
sed, as far as I know, doesn't support this feature. perl obviously does, but I cannot make it work in a one-liner:
perl -i -pe 's/<match.+match>//s' $file
example text:
DONT_MATCH
<match some text here
and here
match>
DONT_MATCH
Upvotes: 2
Views: 1003
Reputation: 58391
This might work for you (GNU sed):
sed '/^<match/{:a;/match>$/!{N;ba};s/.*//}' file
Gather up a collection of lines from one beginning with <match to one ending with match> and replace them with nothing.
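The loop in that one-liner can be easier to follow when it is spread over several lines with comments. The following is only a sketch of the same command, run against the example text from the question; the variable names sample and loop_result are made up for illustration.

```shell
# Sample input copied from the question.
sample=$(printf '%s\n' 'DONT_MATCH' '<match some text here' 'and here' 'match>' 'DONT_MATCH')

loop_result=$(printf '%s\n' "$sample" | sed '
# Only act on a line that starts a block.
/^<match/{
  :a
  # Until the pattern space ends with match>, append the next line and loop.
  /match>$/!{
    N
    ba
  }
  # The pattern space now holds the whole block; empty it.
  s/.*//
}')

printf '%s\n' "$loop_result"
```

Because s/.*// empties the pattern space rather than deleting it, an empty line is printed where the block used to be. Using d in place of s/.*// should drop that line as well.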
N.B. This sed command will act on all such collections throughout the file, and the end-of-file condition will not affect the outcome. To act only on the first, use:
sed '/^<match/{:a;/match>$/!{N;ba};s/.*//;:b;n;bb}' file
To only act on the second such collection use:
sed -E '/^<match/{:a;/match>$/!{N;ba};x;s/^/x/;/^(x{2})$/{x;s/.*//;x};x}' file
The regex /^(x{2})$/ can be tailored to do more intricate matching, e.g. /^(x|x{3,6})$/ would match the first and third to sixth collections.
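As a rough check of the counting idea, the sketch below feeds the "second collection only" command a made-up input containing two blocks. The input lines (A, B, C, inner) and the variable names are assumptions for illustration, and GNU sed is assumed as in the answer.

```shell
# Hypothetical input with two <match ... match> blocks.
two_blocks=$(printf '%s\n' 'A' '<match 1' 'inner' 'match>' 'B' '<match 2' 'match>' 'C')

# The hold space keeps one x per block seen so far; the block is emptied
# only when the counter is exactly xx, i.e. on the second block.
second_only=$(printf '%s\n' "$two_blocks" |
  sed -E '/^<match/{:a;/match>$/!{N;ba};x;s/^/x/;/^(x{2})$/{x;s/.*//;x};x}')

printf '%s\n' "$second_only"
```

With this input the first block is printed unchanged and the second is replaced by an empty line.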
Upvotes: 2
Reputation: 385685
By default, . doesn't match a line feed. The s flag simply makes . match any character.
You are reading the file a line at a time, so you can't possibly match something that spans multiple lines. Use -0777 to treat the entire input as a single line.
perl -i -0777pe's/<match.+match>//s' "$file"
Upvotes: 4
Reputation: 113834
With GNU sed:
$ sed -z 's/<match.*match>//g' file
DONT_MATCH
DONT_MATCH
With any sed:
$ sed 'H;1h;$!d;x; s/<match.*match>//g' file
DONT_MATCH
DONT_MATCH
Both the above approaches read the whole file into memory. If you have a big file (e.g. gigabytes), you might want a different approach.
With GNU sed, the -z option reads in files with NUL as the record separator. For text files, which never contain NUL, this has the effect of reading the whole file in.
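One way to see the effect of -z is to count how many records sed sees with and without it. This is only a sketch assuming GNU sed; the variable names are made up for illustration.

```shell
# Sample input copied from the question.
sample=$(printf '%s\n' 'DONT_MATCH' '<match some text here' 'and here' 'match>' 'DONT_MATCH')

# $= prints the number of the last record read.
lines=$(printf '%s\n' "$sample" | sed -n '$=')       # newline-separated records
records=$(printf '%s\n' "$sample" | sed -z -n '$=')  # NUL-separated records

# With the whole input in one record, the substitution can span lines.
zresult=$(printf '%s\n' "$sample" | sed -z 's/<match.*match>//g')

printf 'lines=%s records=%s\n' "$lines" "$records"
printf '%s\n' "$zresult"
```

Without -z the sample is five records; with -z it is a single record, which is why the substitution can cross line boundaries.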
For ordinary sed, the whole file can be read in with the following steps:
H - Append the current line to the hold space.
1h - If this is the first line, overwrite the hold space with it.
$!d - If this is not the last line, delete the pattern space and jump to the next line.
x - Exchange the hold and pattern spaces to put the whole file in the pattern space.
Upvotes: 1
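The H;1h;$!d;x steps can be checked by replacing every embedded newline with a visible character once the file has been gathered. This is a minimal sketch using the example text from the question; the | marker and the variable names are assumptions for illustration.

```shell
# Sample input copied from the question.
sample=$(printf '%s\n' 'DONT_MATCH' '<match some text here' 'and here' 'match>' 'DONT_MATCH')

# After H;1h;$!d;x the pattern space holds every line, joined by newlines.
# Turning those newlines into | makes that visible on one output line.
joined=$(printf '%s\n' "$sample" | sed 'H;1h;$!d;x;s/\n/|/g')

# The same gathering steps followed by the multi-line substitution.
hresult=$(printf '%s\n' "$sample" | sed 'H;1h;$!d;x;s/<match.*match>//g')

printf '%s\n' "$joined"
printf '%s\n' "$hresult"
```

The first command prints the whole file as a single line, which confirms that the substitution in the second command operates on all lines at once.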