Einsiedler

Reputation: 43

Extract data and save in different output files

I have a data file with the following format:

aaa     0
bbb     1
ccc     2
ddd     ?
eee     0
fff     1
ggg     2
hhh     3
iii     ?
   ...

What I want to do is quite simple: split the data into separate files, where each file holds the lines from a 0 up to and including the next '?'. I would then obtain:

output_1.txt >

aaa     0
bbb     1
ccc     2
ddd     ?

output_2.txt >

eee     0
fff     1
ggg     2
hhh     3
iii     ?

And so on until the end of the input file is reached. I've looked into awk, but I'm not sure how to specify the conditions or how to name each output file after the number of times the data has been split.

Upvotes: 1

Views: 468

Answers (2)

Ed Morton

Reputation: 203645

All you need is:

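awk 'NR==1 || prev=="?"{close(out); out="output_" (++cnt) ".txt"} {print > out; prev=$NF}' file

This starts a new output file on the first line and on every line that follows a "?" line, so each "?" line ends its own file rather than opening the next one. Because each output file is closed before the next is opened, the above will work with any POSIX awk in any shell for any size of input file.

If you wanted a partial match on ? (that is, a last field that merely contains a ?), any of these would do:

awk 'NR==1 || index(prev,"?"){close(out); out="output_" (++cnt) ".txt"} {print > out; prev=$NF}' file

awk 'NR==1 || prev~/\?/{close(out); out="output_" (++cnt) ".txt"} {print > out; prev=$NF}' file

awk 'NR==1 || prev~/[?]/{close(out); out="output_" (++cnt) ".txt"} {print > out; prev=$NF}' file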
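To check a splitter against the sample data from the question, one option is to rebuild the input in a scratch directory and look at what comes out. The sketch below is only an illustration: the scratch directory and the variable name prev are assumptions, and the awk program starts a new file on the line after a "?" by remembering the previous line's last field.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)" || exit 1

# Sample input copied from the question.
printf '%s\n' 'aaa     0' 'bbb     1' 'ccc     2' 'ddd     ?' \
              'eee     0' 'fff     1' 'ggg     2' 'hhh     3' 'iii     ?' > file

# Start a new output file on the first line and on any line that
# follows a "?" line, so each "?" line stays at the end of its chunk.
awk 'NR==1 || prev=="?" {close(out); out="output_" (++cnt) ".txt"}
     {print > out; prev=$NF}' file

# Show how many lines ended up in each output file.
wc -l output_*.txt
```

If the split is right, output_1.txt holds the four lines through ddd and output_2.txt the five lines through iii, with no third file created.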

Upvotes: 3

glenn jackman

Reputation: 246837

You can redirect print statements in awk:

awk -v n=1 '{print > ("output_" n ".txt")} $2 == "?" {n++}' file

If your file is large enough to produce many output files, awk can run out of open file descriptors, so you may have to explicitly close each file once it is complete:

awk -v n=1 '
    {print > ("output_" n ".txt")} 
    $2 == "?" {close("output_" n ".txt"); n++}
' file

If I were feeling really DRY, I would write:

awk -v n=1 '
    function filename(n) {return "output_" n ".txt"} 
    {print > filename(n)} 
    $2 == "?" {close(filename(n++))}  # post-increment matters: close the current file, then advance n
' file
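The commands above assume every input line belongs to some chunk. If the input could also contain stray lines between a "?" and the next 0, a variant along the following lines would skip them. This is a sketch under assumptions: inblock is a hypothetical flag name, the scratch directory is for illustration, and the zzz line is invented here to stand in for a stray line.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)" || exit 1

# Sample input from the question plus one invented stray line (zzz).
printf '%s\n' 'aaa     0' 'bbb     1' 'ccc     2' 'ddd     ?' \
              'zzz     9' \
              'eee     0' 'fff     1' 'ggg     2' 'hhh     3' 'iii     ?' > file

awk '
    # A "0" outside any chunk opens the next output file.
    $2 == "0" && !inblock {inblock = 1; out = "output_" (++n) ".txt"}
    # Only lines inside a chunk are written.
    inblock {print > out}
    # A "?" ends the chunk: close its file and wait for the next "0".
    inblock && $2 == "?" {close(out); inblock = 0}
' file
```

The comparison uses the string "0" rather than the number 0 so that a line with no second field is not mistaken for the start of a chunk.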

Upvotes: 3
