Reputation: 135

Multi Line AWK capture

Say I have the following input in a file, file1.txt:

start
asfsafsf
faffsa
gygfyt
end1
dddadd
start
afsaf
safsaf
asdasd
start
asda
DD
end2
aasfsa
afaf
start
dada
afaf
asfs
end3
fafaf

I need to capture the lines between start and end3, counting from the nearest start before end3. The expected output is:

start
dada
afaf
asfs
end3

If I need to capture up to end2 instead, the output should be:

start
asda
DD
end2

Can someone help me with an awk command for this, as sed is slower?

Upvotes: 1

Answers (4)

Mark Setchell

Reputation: 207738

You can do it quite legibly like this:

awk '/start/{out=$0;next} /end3/{out=out RS $0;print out;out=""}{if(length(out))out=out RS $0}' file

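So, if we see the word start, we set the output string to the current line and move on to the next line; any later start overwrites what was accumulated, which is why only the nearest start before the end marker survives. When we reach end3 (you can change it to end2), we append that line and print the accumulated output. On all other lines, if we have already started accumulating output, we append the current line after the record separator RS (a newline by default).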
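For readers who prefer the logic spelled out, here is a minimal sketch of the same accumulate-and-reset idea wrapped in a shell function. The function name capture_block, the awk variables marker and collecting, and the exact-line comparison (instead of the regex match used in the one-liner) are assumptions made for illustration, not part of the original command.

```shell
# Recreate the sample input from the question.
printf '%s\n' start asfsafsf faffsa gygfyt end1 dddadd \
    start afsaf safsaf asdasd start asda DD end2 aasfsa afaf \
    start dada afaf asfs end3 fafaf > file1.txt

# capture_block MARKER FILE   (hypothetical helper name)
# Prints the lines from the nearest preceding "start" through MARKER.
# Assumes "start" and the end marker each sit alone on a line.
capture_block() {
    awk -v marker="$1" '
        # Every start line discards the old buffer and begins a new one.
        $0 == "start" { out = $0; collecting = 1; next }
        # The wanted end marker closes the block: print it and reset.
        collecting && $0 == marker { print out RS $0; out = ""; collecting = 0; next }
        # Any other line is appended while a block is open.
        collecting { out = out RS $0 }
    ' "$2"
}

capture_block end3 file1.txt
```

Calling capture_block end2 file1.txt selects the end2 block in the same way, so the end marker no longer has to be edited inside the awk program.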

If you have lots of files and you want to parse them in parallel, you can use GNU Parallel, like this:

parallel -q awk '/start/{out=$0;next} /end3/{out=out RS $0;print out;out=""}{if(length(out))out=out RS $0}' ::: *.txt

Upvotes: 2

Juan Diego Godoy Robles

Reputation: 14965

Reversing the input file will do the trick:

$ tac infile|awk '/end3/{f=1}f;/start/{f=0}'|tac

For multiple files use:

$ tac files*|awk '/end3/{f=1}f;/start/{f=0}'|tac
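Why reversing works: in the reversed stream the end marker is seen first, and the first start after it is the nearest start before it in the original order. Below is a commented sketch of that flag logic. The function name capture_reversed and the marker variable are illustrative assumptions, and tac is assumed to be available (it ships with GNU coreutils).

```shell
# Recreate the sample input from the question.
printf '%s\n' start asfsafsf faffsa gygfyt end1 dddadd \
    start afsaf safsaf asdasd start asda DD end2 aasfsa afaf \
    start dada afaf asfs end3 fafaf > file1.txt

# capture_reversed MARKER FILE   (hypothetical helper name)
capture_reversed() {
    tac "$2" | awk -v marker="$1" '
        $0 ~ marker { f = 1 }   # end marker seen: switch printing on
        f                       # print the line while the flag is set
        /start/     { f = 0 }   # start was just printed: switch printing off
    ' | tac                     # restore the original line order
}

capture_reversed end3 file1.txt
```

Note that if the end marker occurs more than once in the input, every matching block is printed, not only the last one.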

Upvotes: 1

Avinash Raj

Reputation: 174806

With Perl:

$ perl -0777pe 's/.*(?:^|\n)(start(?:(?!start|end3).)*\nend3)(?:\n|$).*/\1\n/s' f
start
dada
afaf
asfs
end3
$ perl -0777pe 's/.*(?:^|\n)(start(?:(?!start|end2).)*\nend2)(?:\n|$).*/\1\n/s' f
start
asda
DD
end2

Upvotes: 0

Kent

Reputation: 195209

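This awk one-liner works whether or not every start is paired with an endN. It reads the file twice: the first pass records the line numbers of the wanted end marker and of the last start seen before it, and the second pass prints that range.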

awk -v n="2" 'NR==FNR{a[$0]=NR;if($0~"end"n){s=a["start"];e=a["end"n];nextfile}}
                      FNR>=s&&FNR<=e' file file

Replace the literal "2" in -v n="2" with a variable (for example a shell variable) to make it dynamic.
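As a rough illustration of the dynamic form, here is a sketch of the two-pass idea driven by a shell variable. It is a variant rather than the one-liner above: it compares whole lines and uses a found flag instead of nextfile, so it does not rely on awk implementations that support nextfile. The names two_pass_capture, end_number and found are hypothetical.

```shell
# Recreate the sample input from the question.
printf '%s\n' start asfsafsf faffsa gygfyt end1 dddadd \
    start afsaf safsaf asdasd start asda DD end2 aasfsa afaf \
    start dada afaf asfs end3 fafaf > file1.txt

end_number=3   # hypothetical shell variable: selects end1, end2, end3, ...

# two_pass_capture N FILE   (hypothetical helper name)
# The same file is read twice: pass 1 finds the range, pass 2 prints it.
two_pass_capture() {
    awk -v n="$1" '
        NR == FNR {                                  # pass 1: first file argument
            if (!found) {
                if ($0 == "start") s = FNR           # latest start so far
                else if ($0 == ("end" n)) { e = FNR; found = 1 }
            }
            next
        }
        found && FNR >= s && FNR <= e                # pass 2: print lines s through e
    ' "$2" "$2"
}

two_pass_capture "$end_number" file1.txt
```

If the requested end marker never appears, found stays unset and nothing is printed.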

Upvotes: 0
