Paul
Paul

Reputation: 431

awk CSV Split with headers Windows

Ok I have a csv file I need to split based on a column value which is fine, but I cannot get the headers to print in each file.

Currently I use:

awk "FS =\",\" {output=$3\".csv\"; print $0 > output}" test.csv

Which splits the files file based on column 3, but I don't know how to add the header to each file.

I've searched high & low but can't find a solution that will work in a one liner...

UPDATE

OK to date we have a working one liner:

awk -F, "NR==1{hdr=$0;next}!($3 in files){files[$3]=1;print hdr>$3\".csv\"}{print>$3\".csv\"}" test.csv

Or in test.awk:

BEGIN{FS=","} NR==1 {hdr=$0;next}!($3 in files) {files[$3]=1;print hdr>$3".csv"}{print>$3".csv"}

Command to run used:

awk -f test.awk test.csv

I really appreciate the help here, I've been trying for hours and have a few things left to work out.

1) Blank line inserted after header 2) Sort the data on specified fields

Further down the line I want to additionally do a row count & cut a reference number from another file is this possible with AWK or am I using the wrong tool for the job?

Thanks again.

Upvotes: 3

Views: 2150

Answers (4)

TrueY
TrueY

Reputation: 7610

UPDATED#2

Blank line after header line

UPDATED

Try this:

On Unix/cygwin (I tested on cygwin):

awk -F, 'NR==1{hdr=$0;next}!($3 in files){files[$3]=1;print hdr"\n">$3".csv"}{print>$3".csv"}' test.csv

Or adding Kent's ideas:

awk -F, 'NR==1{hdr=$0;next}{out=$3".csv"}!($3 in files){files[$3];print hdr"\n">out}{print>out}' test.csv

On windows cmd (not tested):

awk -F, "NR==1{hdr=$0;next}!($3 in files){files[$3]=1;print hdr\"\n\">$3\".csv\"}{print>$3\".csv\"}" test.csv

This stores the header line in test.csv to hdr. For the next lines it checks if the file name value is already exists. If not then stores its name in the files hash and prints the header line. And anyway it prints the whole line to the file.

Example file:

$ cat test.csv 
A,B,C,D
1,2,a,3
4,5,b,4

Output

$ cat a.csv 
A,B,C,D

1,2,a,3

$ cat b.csv 

A,B,C,D
4,5,b,4

ADDED

If You would like to put the awk script into a file You could try (I cannot test is, sorry).

test.awk

BEGIN{FS=","} 
NR==1 {hdr=$0;next}
!($3 in files) {files[$3]=1;print hdr"\n">$3".csv"}
{print>"$3.csv"}

Then You may call it as

awk -f test.awk test.csv

Upvotes: 1

jaypal singh
jaypal singh

Reputation: 77105

Try something like this:

awk -F, '
BEGIN {
    getline header
} 
{
    out=$3".csv"
    if (!($3 in seen)) { 
        print header > out 
    }
    print $0 > out
    seen[$3]
}' test.csv

Windows version: (Not tested)

awk " FS =\",\"
BEGIN {
    getline header
} 
{
    out=$3\".csv\"
    if (!($3 in seen)) { 
        print header > out 
    }
    print $0 > out
    seen[$3]
}" test.csv

Upvotes: 1

Kent
Kent

Reputation: 195079

awk -F, 'NR==1{h=$0;next}{out=$3".csv";
    if!(out in a)print h> out; print $0 > out;a[out]}' test.csv

Upvotes: 1

William Pursell
William Pursell

Reputation: 212248

awk '{ output=$3".csv"; if( !($0 in a)) print "header" > output; a[$0]
    print > output}' FS=, test.csv

Upvotes: 0

Related Questions