Leonardo Mecacci
Leonardo Mecacci

Reputation: 11

Split csv file in bash accord to column value

I've a CSV file like this:

file1

field1 field2 field3 field4 
000000 nodata NP1212 1.1212
000000 212344 NP1212 1.1232  
000000 343423 NX3212 1.2342  
000000 542346 NX3212 1.6345  
000000 nodata NZ3244 1.0345 

It is tab delimited, with about 70000 rows, with 23 field and sorted by field3. How can I split it by rows to create this kind of outputs?

file1_NP1212

field1 field2 field3 field4  
000000 nodata NP1212 1.1212  
000000 212344 NP1212 1.1232  

file1_NX3212

field1 field2 field3 field4    
000000 343423 NX3212 1.2342  
000000 542346 NX3212 1.6345  

file1_NZ3244

field1 field2 field3 field4   
000000 nodata NZ3244 1.0345 

Upvotes: 1

Views: 189

Answers (2)

anubhava
anubhava

Reputation: 785156

This awk command should do the trick:

awk 'NR==1{hdr=$0; next}
     {fn="file1_" $3; if (p != $3) {close(p); p=$3; print hdr > fn} print > fn}
     END {close(p)}' file1

Upvotes: 2

tripleee
tripleee

Reputation: 189397

If you want to keep the header line in every output file and be robust against running out of file handles, try this.

awk -F '\t' 'NR == 1 { header=$0; next }
    $3 != prev { close(handle); prev = $3; handle = "file1_" $3; print header >handle }
    { print >handle }' file1

This assumes the input is already sorted on field 3; if not, you will end up overwriting files with partial output.

Upvotes: 0

Related Questions