Reputation: 11834
I have text which spans multiple lines
... someabove text
jpqpq====== mcvnmcv
.... s;ql[[pw]]
<<<<<< uyuuey
... middle text
jhasjh ======dsadsas
.... grqywtrt
klklk <<<<<<alallal
... someend text
I want to remove all the text from ====== through <<<<<<.
In Sublime Text I use
find: (?s)(======(?:(?!======).)*?<<<<<<)
replace: (left empty)
and all the occurrences are removed, so the output looks like this:
... someabove text
jpqpq uyuuey
... middle text
jhasjh alallal
... someend text
Now I want to do this from the command line, using sed or awk or anything else, because opening the file and doing the replace every time is tedious.
I searched for sed and awk and found that they don't support lookaheads (zero-width assertions), and that perl is used in these cases.
Can someone confirm that sed and awk can't use a pattern like (======(?:(?!======).)*?<<<<<<) and that I have to try some indirect way?
I am still looking for how to do this with sed and awk (even indirectly), and also with perl (if lookahead is allowed).
My attempt with perl didn't work either:
perl -ne 's/"(======(?:(?!======).)*?<<<<<<)"/""/g; print' file
blank output
Upvotes: 2
Views: 1641
Reputation:
If there is no < character between ====== and <<<<<< in the data (file 'd' here), this works; tried on GNU sed:
sed -Ez 's/={6}[^<]*<{6}//g' d
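To make the "no < inside the block" condition concrete, here is a small sketch, assuming GNU sed (for -z and -E). The two inline samples are made up for illustration and are not from the question.

```shell
#!/bin/sh
# Assumes GNU sed: -z reads the whole input as one record, -E enables ERE.

# Block without any '<' inside: the whole span is removed.
clean=$(printf 'a======xy\nz<<<<<<b\n' | sed -Ez 's/={6}[^<]*<{6}//g')

# Block with a stray '<' inside: [^<]* stops at it, so nothing matches
# and the text comes back unchanged.
stray=$(printf 'a======x<y\nz<<<<<<b\n' | sed -Ez 's/={6}[^<]*<{6}//g')

printf '%s\n' "$clean"
printf '%s\n' "$stray"
```

So this one-liner is fine for data like the sample in the question, but it silently skips any block that contains a < before the closing marker.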
Upvotes: 0
Reputation: 204015
Right, you don't get looka-whatever with sed or awk, but you also don't need it; it's just syntactic sugar. With GNU awk for multi-char RS:
$ awk -v RS='<<<<<<' -v ORS= 'RT{sub(/======.*/,"")} 1' file
... someabove text
jpqpq uyuuey
... middle text
jhasjh alallal
... someend text
and with GNU sed for -z:
$ sed -z 's/@/@A/g; s/{/@B/g; s/}/@C/g; s/======/{/g; s/<<<<<</}/g;
s/{[^{}]*}//g;
s/}/<<<<<</g; s/{/======/g; s/@C/}/g; s/@B/{/g; s/@A/@/g
' file
... someabove text
jpqpq uyuuey
... middle text
jhasjh alallal
... someend text
Upvotes: 0
Reputation: 3380
Yes, neither awk nor sed supports lookarounds. More specifically, the regex flavors they use don't support them.
Your perl command failed because you need to tell it that this is a multiline string (the s modifier). The double quotes inside the s/// are also matched literally, so they have to go. But that would still fail because perl reads input line by line and would apply the replacement operator to each line separately. If you want it to match across the entire file, you need to slurp it with -0777. This does what you need:
$ perl -0777pe 's/======.*?<<<<<<//gs' file
... someabove text
jpqpq uyuuey
... middle text
jhasjh alallal
... someend text
The -0777 causes perl to slurp the entire file. The -p makes it print each record after running the code (here the single record is the whole file), and the -e gives it the code you want it to run. I also simplified your regex since there seems to be no reason to use such a complex approach. ======.*?<<<<<< will match ======, then the .*?<<<<<< means "as few characters as possible until the <<<<<<". Finally, the /gs at the end will let . match newlines (s) and will make the replacement operator work globally (g) so it will replace all occurrences.
In sed, if your markers were on lines by themselves, that is, if you wanted to delete every line from the ====== line through the <<<<<< line, you could do this:
$ sed '/======/,/<<<<<</d' file
... someabove text
... middle text
... someend text
But that won't work for you here, since your markers appear in the middle of lines and the range deletes whole lines.
Upvotes: 2