Reputation: 41
The issue is that I wish to remove all text between two strings on a line using sed. I understand the use of: sed -i 's/str1.*str2//' file.dat
to remove the text between str1 and str2, inclusive of str1 and str2, but my line has str1 and str2 repeated on the line many times, and I would like to remove the text between each pair. My attempt above removes all text between the first instance of str1 and the last instance of str2. Would appreciate some help in understanding the function to do this.
In addition I would like to repeat this across all lines in the file, and do not know how many times the str1, str2 pair appears on each line. It varies.
Kind Regards
Additional Edit - hope not into a flame-wall!
An example may be of use; I'm having trouble understanding the answers thus far, sorry guys.
For a single line in a file example.dat:
bla.bla.TextOfUnknownLength.bla.bla 1023=3 290=1 336=17 273=07:59:57.833 276=K 278=0 bla.bla.TextOfUnknownLength.bla.bla 1023=20 290=2 336=7 273=07:59:57.833 276=K 278=0 bla.bla.TextOfUnknownLength.bla.bla ...
I wish to remove from 1023= to 278= inclusive (but not the 0 after 278=) in all instances. This text between 1023= and 278= can occur many times in a line and is of unknown length.
There are also many lines in the file, and I would like to run this across all lines.
HS
Upvotes: 4
Views: 8713
Reputation: 955
sed -ri 's/(foo)(.*)(bar)/\1\3/g' between.file
Explanation: -r enables extended regular expressions, which are used to match the part before, between and after in your line. Then just replace the match with the prefix \1 and the suffix \3, using sed's backreferences with leading backslashes.
UPDATE:
Suppose between.file contains the following contents.
foo---1---bar
foo---2---bar
foo---3---bar
Then the command above removes the contents between foo and bar, so the output looks like
foobar
foobar
foobar
Wasn't that your desired output/change in your file?
UPDATE: I think awk fits better for your needs.
Assume between.file contains the following lines
A foo---1---bar B foo---10--bar C
A foo---2---bar D foo---20--bar E
A foo---3---bar B foo---30---bar C
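This script
#!/bin/bash
awk '{
  all = ""; sep = "";
  # fields are numbered from 1; $0 is the whole line
  for (i = 1; i <= NF; i++) {
    # keep only the fields that do not contain foo...bar
    if ($i !~ /foo.*bar/) { all = all sep $i; sep = " "; }
  }
  print all;
}' between.file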
will produce the following output
A B C
A D E
A B C
You could use this to implement some kind of DFA that switches into a specific state when reading 1023= and leaves it again when reading 278=.
Redirect the output to a new file, or search the docuMANtation for awk to process directly on a file. Hope this helps.
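The state-switching idea can be sketched roughly as follows. This is only an illustration under assumptions not stated in the answer: the markers 1023= and 278= are assumed to always begin a whitespace-separated field, runs of whitespace are collapsed to single spaces, and the function name strip_pairs and the sample line are made up for the example.

```shell
# Hypothetical helper: drops every field from one starting with "1023="
# through one starting with "278=", keeping whatever follows "278=".
strip_pairs() {
  awk '{
    out = ""; sep = ""; skipping = 0
    for (i = 1; i <= NF; i++) {
      field = $i
      if (!skipping && index(field, "1023=") == 1) {
        skipping = 1                  # enter the "inside a pair" state
        continue
      }
      if (skipping) {
        if (index(field, "278=") != 1) continue   # still inside the pair
        skipping = 0                  # leave the state on 278=
        field = substr(field, 5)      # keep the value after "278="
      }
      out = out sep field
      sep = " "
    }
    print out
  }' "$@"
}

printf '%s\n' 'pre 1023=3 290=1 278=0 mid 1023=20 336=7 278=0 post' | strip_pairs
# prints: pre 0 mid 0 post
```

Because the state is reset at the start of every line, an unmatched 1023= only swallows the rest of that one line.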
Upvotes: 2
Reputation: 58420
This might work for you (GNU sed):
sed -r ':a;s/([^\n]*)(foo)[^\n]+(bar)/\1\n\2\3/;ta;s/\n//g' file
Use greed, a unique delimiter and a loop to remove the characters between foo and bar. The greed works backwards through the line, and the delimiter stops the part of the line that has already been processed from being processed again. The loop repeats this for every occurrence of foo through bar on the line.
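As a rough sketch, the same loop can be adapted to the markers from the question. Unlike the command above, this variant also drops the two markers themselves, since the question asks for an inclusive removal. It assumes GNU sed (for -r and for \n in the pattern and replacement); the function name strip_pairs_sed and the sample line are invented for illustration.

```shell
# Assumes GNU sed. The temporary newline marks where the processed tail of
# the line begins; [^\n] keeps later passes from matching across it.
strip_pairs_sed() {
  sed -r ':a;s/([^\n]*)1023=[^\n]*278=/\1\n/;ta;s/\n//g' "$@"
}

printf '%s\n' 'pre 1023=3 290=1 278=0 mid 1023=20 336=7 278=0 post' | strip_pairs_sed
# prints: pre 0 mid 0 post
```

Adding -i and a file name (for example strip_pairs_sed -i example.dat) would edit the file in place, which is worth trying on a copy first.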
Upvotes: 0
Reputation: 10039
Just add the g flag at the end of your sed command.
sed -i 's/str1.*str2//g' file.dat
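It may be worth checking this on a line with several pairs before relying on it. The quick test below, using a made-up sample line, suggests that g on its own does not make .* any less greedy: the first match already runs from the first str1 to the last str2, so nothing is left for a second match.

```shell
# Hypothetical sample line with two str1...str2 pairs.
line='A str1 x str2 B str1 y str2 C'

result=$(printf '%s\n' "$line" | sed 's/str1.*str2//g')
printf '%s\n' "$result"
# prints "A  C" (two spaces): the text "B" between the pairs is removed too
```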
Remark: with this, take care with characters such as (, {, [, \, &, ^, . and so on in str1 and str2, depending on the wanted behaviour.
Upvotes: 0