Reputation: 43
I have a data file with the following format:
aaa 0
bbb 1
ccc 2
ddd ?
eee 0
fff 1
ggg 2
hhh 3
iii ?
...
What I want to do is quite simple: split the data into separate files, where each piece runs from a line ending in 0 through the next line ending in '?', so that I would obtain:
output_1.txt >
aaa 0
bbb 1
ccc 2
ddd ?
output_2.txt >
eee 0
fff 1
ggg 2
hhh 3
iii ?
And so on until the end of the input file is reached. I've looked into the awk command, but I'm not sure how to specify the condition or how to name each output file according to the number of times the data has been split.
Upvotes: 1
Views: 468
Reputation: 203645
All you need is:
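awk 'out==""{out="output_"++cnt".txt"} {print > out} $NF=="?"{close(out); out=""}' file
Each line is written to the current file first, and the file is closed only after its ? line has been written, so the ? line ends the chunk it belongs to rather than starting the next one. Since only one output file is open at a time, the above will work with any awk in any shell on any UNIX system for any size of input file.
If you wanted to do a partial match on ? (see the comments below), then it'd be any of these:
awk 'out==""{out="output_"++cnt".txt"} {print > out} index($NF,"?"){close(out); out=""}' file
awk 'out==""{out="output_"++cnt".txt"} {print > out} $NF~/\?/{close(out); out=""}' file
awk 'out==""{out="output_"++cnt".txt"} {print > out} $NF~/[?]/{close(out); out=""}' file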
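To check a split like this against the sample data from the question, here is a minimal sketch. It assumes you run it in a scratch directory, since it writes input.txt and output_N.txt into the current directory; the input.txt name is just a placeholder.

```shell
#!/bin/sh
# Sample input copied from the question (input.txt is a placeholder name).
cat > input.txt <<'EOF'
aaa 0
bbb 1
ccc 2
ddd ?
eee 0
fff 1
ggg 2
hhh 3
iii ?
EOF

# Pick a new output name only when no file is active, write the line,
# then close the file once its terminating "?" line has been written.
awk '
out == ""  { out = "output_" (++cnt) ".txt" }
           { print > out }
$NF == "?" { close(out); out = "" }
' input.txt

# Show each resulting file under a header line.
for f in output_*.txt; do
    printf '== %s ==\n' "$f"
    cat "$f"
done
```

With the sample input this should leave exactly two files, output_1.txt ending in "ddd ?" and output_2.txt ending in "iii ?".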
Upvotes: 3
Reputation: 246837
You can redirect print statements in awk:
awk -v n=1 '{print > ("output_" n ".txt")} $2 == "?" {n++}' file
If the input is large enough to produce many output files, awk can run out of file descriptors, so you may have to explicitly close each file once it is finished:
awk -v n=1 '
{print > ("output_" n ".txt")}
$2 == "?" {close("output_" n ".txt"); n++}
' file
If I were feeling really DRY, I would write:
awk -v n=1 '
function filename(n) {return "output_" n ".txt"}
{print > filename(n)}
$2 == "?" {close(filename(n++))} # post-increment matters: close the current file, then advance n
' file
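To see the explicit close at work, here is a rough sketch that generates an input with many chunks and splits it. The chunk count of 2000, the generated line contents, and the big_input.txt and chunks/ names are all made up for illustration; 2000 is simply chosen to be above the default per-process open-file limit on many systems (often 1024 on Linux).

```shell
#!/bin/sh
# Generate 2000 two-line chunks: "x<i> 0" followed by "y<i> ?" (arbitrary test data).
awk 'BEGIN { for (i = 1; i <= 2000; i++) printf "x%d 0\ny%d ?\n", i, i }' > big_input.txt

# Keep the generated files out of the current directory.
mkdir -p chunks

# Write each line to the current file, and close that file as soon as its
# "?" line has been written, so only one output file is open at a time.
awk -v n=1 '
function filename(n) { return "chunks/output_" n ".txt" }
{ print > filename(n) }
$2 == "?" { close(filename(n++)) }
' big_input.txt

# Count the files produced.
ls chunks | wc -l
```

Because every file is closed before the next one is opened, the number of chunks is not bounded by the open-file limit.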
Upvotes: 3