Reputation: 1255
I am trying to extract blocks from a text file and write each one to its own new file. For example, consider the following file:
some junk lines
ABC: this is abc text
abc block text1
abc block text2
abc block text3

I dont care about this line
Text at start of block. I dont want this line also.
ABC: this is another abc text
abc block text5
abc block text2
abc block text3
abc block text1

some other dont care line
I am interested in the 'ABC' blocks. Every block starts with "ABC:" and ends with a blank line. So, I want to generate abc1.txt that contains:
ABC: this is abc text
abc block text1
abc block text2
abc block text3
and abc2.txt that contains:
ABC: this is another abc text
abc block text5
abc block text2
abc block text3
abc block text1
I tried using awk to get the blocks, but I am having a hard time matching the terminating blank line.
One option is to write a script that loops through every line in the file, but I believe there is a better solution. Can someone please help? Thanks in advance!
Upvotes: 0
Views: 685
Reputation: 204446
Your problem of blocks of text separated by blank lines is exactly what awk's "paragraph mode" exists to handle. It is activated by setting RS to the null string:
awk -v RS= '/^ABC:/{print > ("abc"++c".txt")}' file
The above will work if you don't have a lot of output files, or if you're using GNU awk, since it handles closing files for you when necessary. If you do have a lot of output files but can't get GNU awk, then you just need to tweak it to:
awk -v RS= '/^ABC:/{close(f); f="abc"++c".txt"; print > f}' file
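To make the paragraph-mode behaviour concrete, here is a minimal sketch that runs the second command against a copy of the question's sample. It rests on two assumptions that are not stated in the original: the blocks are terminated by blank lines, and every "ABC:" line is the first line of its paragraph (the "Text at start of block" line is left out for that reason). The file name sample.txt is only a placeholder, and the files are created in the current directory.

```shell
# Sample input (assumed layout): each block starts with "ABC:" on the
# first line of a paragraph and ends at a blank line.
cat > sample.txt <<'EOF'
some junk lines

ABC: this is abc text
abc block text1
abc block text2
abc block text3

I dont care about this line

ABC: this is another abc text
abc block text5
abc block text2
abc block text3
abc block text1

some other dont care line
EOF

# RS= enables paragraph mode: every blank-line-separated block is one record.
# close(f) releases the previous output file before the next one is opened.
awk -v RS= '/^ABC:/{close(f); f="abc" (++c) ".txt"; print > f}' sample.txt

# Show what was written.
head abc1.txt abc2.txt
```

One thing to keep in mind: in paragraph mode ^ anchors to the start of the whole record, so a paragraph that has some other line directly above its "ABC:" line would not match /^ABC:/ and would be skipped.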
Upvotes: 1
Reputation: 195229
This one-liner should do the job:
awk '/^ABC/{p=1;close(fn);fn="abc"++i}!NF{p=0}p{print > fn}' file
With your example as input:
kent$ awk '/^ABC/{p=1;close(fn);fn="abc"++i}!NF{p=0}p{print > fn}' f
kent$ head abc*
==> abc1 <==
ABC: this is abc text
abc block text1
abc block text2
abc block text3
==> abc2 <==
ABC: this is another abc text
abc block text5
abc block text2
abc block text3
abc block text1
close(fn) is necessary if you have many "ABC" blocks; otherwise you will get error messages like "too many open files".
Upvotes: 4
Reputation: 12887
awk '/^ABC:/,/^$/' filename
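This prints every run of lines that begins with a line starting with ABC: (^ anchors the start of the line) and ends at the next blank line (^$). The comma between the two patterns makes them a range pattern, which selects everything from the first match through the second. Note that it prints all blocks to standard output rather than writing them to separate files.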
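The range pattern on its own sends every block to standard output. A minimal sketch of one way to extend it so that each block lands in its own file, as the question asks, could look like the following. The variable names out and n, the file name sample2.txt, and the blank lines in the sample are assumptions made for illustration, and this is not part of the original answer.

```shell
# Sample input (assumed layout): a blank line terminates each block.
cat > sample2.txt <<'EOF'
some junk lines
ABC: this is abc text
abc block text1
abc block text2
abc block text3

I dont care about this line
Text at start of block. I dont want this line also.
ABC: this is another abc text
abc block text5
abc block text2
abc block text3
abc block text1

some other dont care line
EOF

awk '
  # A line starting with ABC: opens a new block, so switch to the next file.
  /^ABC:/ { close(out); out = "abc" (++n) ".txt" }
  # Inside the range, print non-empty lines only; this drops the closing blank line.
  /^ABC:/,/^$/ { if (NF) print > out }
' sample2.txt

# Show what was written.
head abc1.txt abc2.txt
```

Because the range is evaluated line by line, a junk line directly above an "ABC:" line is simply outside the range and is not copied.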
Upvotes: -2