Reputation: 135
I have a requirement. Let's say I have the following input in the file file1.txt:
start
asfsafsf
faffsa
gygfyt
end1
dddadd
start
afsaf
safsaf
asdasd
start
asda
DD
end2
aasfsa
afaf
start
dada
afaf
asfs
end3
fafaf
I need to capture the lines between start and end3, so the expected output is:
start
dada
afaf
asfs
end3
If I need to capture for end2 instead, the output should be:
start
asda
DD
end2
Can someone help me with an awk command, as sed is slower?
Upvotes: 1
Views: 1168
Reputation: 207738
You can do it quite legibly like this:
awk '/start/{out=$0;next} /end3/{out=out RS $0;print out;out=""}{if(length(out))out=out RS $0}' file
So, if we see the word start, we set the output string to the current line and move on to the next line. If we have reached end3 (you can change it to end2), we print the accumulated output. On all other lines, if we have started accumulating output, we append the current line after a newline character.
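The same accumulate-and-reset logic can be sketched as a small shell function so the end marker becomes a parameter instead of being hard-coded. The function name extract_block and the awk variable marker are made up for illustration. Unlike the one-liner above, this sketch assumes the marker lines contain nothing but start or the end marker, so it compares whole lines instead of matching a regex.

```shell
# Hypothetical helper (not from the original answer): print the block that
# runs from the nearest preceding "start" line to the given end marker.
extract_block() {
    # $1 = end marker (for example end3), $2 = input file
    awk -v marker="$1" '
        $0 == "start" { out = $0; next }    # new block: discard anything accumulated so far
        $0 == marker  {                     # requested end marker reached
            if (out != "") print out RS $0  # print only if a start line was seen
            out = ""
            next
        }
        out != ""     { out = out RS $0 }   # inside a block: append the current line
    ' "$2"
}
```

Calling extract_block end3 file1.txt on the input from the question should print the start ... end3 block, and extract_block end2 file1.txt the start ... end2 block.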
If you have lots of files and you want to parse them in parallel, you can use GNU Parallel, like this:
parallel -q awk '/start/{out=$0;next} /end3/{out=out RS $0;print out;out=""}{if(length(out))out=out RS $0}' ::: *.txt
Upvotes: 2
Reputation: 14965
Reversing the input file will do the trick:
$ tac infile|awk '/end3/{f=1}f;/start/{f=0}'|tac
For multiple files use:
$ tac files*|awk '/end3/{f=1}f;/start/{f=0}'|tac
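To make the flag logic easier to follow, here is a commented sketch of the same pipeline wrapped in a function. The name extract_block_tac and the variable marker are hypothetical, and it assumes tac (GNU coreutils) is installed. Because the input is reversed, the end marker is seen first, and the first start after it is the nearest one in the original order.

```shell
# Hypothetical wrapper around the tac | awk | tac pipeline above.
extract_block_tac() {
    # $1 = end marker (treated as a regex, as in the one-liner), $2 = input file
    tac "$2" | awk -v marker="$1" '
        $0 ~ marker { f = 1 }   # reversed input: the end marker arrives first, turn printing on
        f                       # print the current line while the flag is set
        /start/     { f = 0 }   # nearest start line was just printed, turn printing off
    ' | tac                     # restore the original line order
}
```

For example, extract_block_tac end2 infile should print the start ... end2 block from the question's sample input.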
Upvotes: 1
Reputation: 174806
With Perl:
$ perl -0777pe 's/.*(?:^|\n)(start(?:(?!start|end3).)*\nend3)(?:\n|$).*/\1\n/s' f
start
dada
afaf
asfs
end3
$ perl -0777pe 's/.*(?:^|\n)(start(?:(?!start|end2).)*\nend2)(?:\n|$).*/\1\n/s' f
start
asda
DD
end2
Upvotes: 0
Reputation: 195209
This awk one-liner does it whether or not the start and endN lines are paired:
awk -v n="2" 'NR==FNR{a[$0]=NR;if($0~"end"n){s=a["start"];e=a["end"n];nextfile}}
FNR>=s&&FNR<=e' file file
Change the -v n="2" to a shell variable to make it dynamic.
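As a rough sketch of the two-pass idea with the number passed in from the shell, the function below reads the file twice: the first pass records the line numbers of the nearest start and of endN, and the second pass prints that range. The name extract_by_number is made up for illustration. This variant deliberately avoids nextfile, which not every awk implementation supports, and it compares whole lines instead of using a regex, which is an assumption about the input.

```shell
# Hypothetical two-pass helper: the same file is given to awk twice.
extract_by_number() {
    # $1 = numeric suffix of the end marker (2 selects end2), $2 = input file
    awk -v n="$1" '
        NR == FNR {                             # first pass
            if (e) next                         # range already found, skip the rest of pass one
            if ($0 == "start") s = FNR          # remember the most recent start line
            else if ($0 == ("end" n)) e = FNR   # remember where the requested end marker is
            next
        }
        s && e && FNR >= s && FNR <= e          # second pass: print the recorded range
    ' "$2" "$2"
}
```

With a shell variable, the call becomes something like n=3; extract_by_number "$n" file.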
Upvotes: 0