Malgi

Reputation: 730

How to concatenate multiple files with the same header when some files contain only the header?

I have a large number of files named 0.file.csv, ..., 1000.file.csv. I need to concatenate them, keeping the header of the first file and dropping the headers of all the others. The solution I came up with is:

sudo awk 'FNR==1 && NR!=1{next;}{print}' {0..1000}.file.csv > file.csv

However, this solution does not work if some of the files contain only a header.

The sample input is:

0.file.csv
person_id, dob, year, subject, degree
0,1984/12/01,2014,math,ms

1.file.csv
person_id, dob, year, subject, degree

2.file.csv
person_id, dob, year, subject, degree
200,1990/03/12,2015,physics,bs
201,1991/04/18,2015,math,ms

The output should be:

person_id, dob, year, subject, degree
0,1984/12/01,2014,math,ms
200,1990/03/12,2015,physics,bs
201,1991/04/18,2015,math,ms

Upvotes: 3

Views: 7260

Answers (3)

Cyrus

Reputation: 88731

With GNU grep:

cat 0.file.csv > file.csv
grep -vh '^person_id, dob, year, subject, degree$' {1..1000}.file.csv >> file.csv

Output to file.csv:

person_id, dob, year, subject, degree
0,1984/12/01,2014,math,ms
200,1990/03/12,2015,physics,bs
201,1991/04/18,2015,math,ms

Or with GNU sed, which produces the same output:

cat 0.file.csv > file.csv
sed -sn '2,$p' {1..1000}.file.csv >> file.csv
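
One caveat with the grep variant: the pattern is treated as a regular expression, and every line matching the header is dropped, wherever it appears. A possible hardening, sketched here against the three sample files from the question rather than the full 0..1000 range, is to read the header from the first file and match it as a fixed string covering the whole line (-F and -x). The variable name header is my own choice for illustration.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'person_id, dob, year, subject, degree' '0,1984/12/01,2014,math,ms' > 0.file.csv
printf '%s\n' 'person_id, dob, year, subject, degree' > 1.file.csv
printf '%s\n' 'person_id, dob, year, subject, degree' '200,1990/03/12,2015,physics,bs' '201,1991/04/18,2015,math,ms' > 2.file.csv

# Take the header text from the first file instead of hard-coding it.
header=$(head -n 1 0.file.csv)

# Copy the first file as-is, then append every line of the remaining files
# that is not exactly the header:
#   -v invert the match, -h omit file names, -x whole-line match, -F literal string
cat 0.file.csv > file.csv
grep -vhxF -- "$header" 1.file.csv 2.file.csv >> file.csv

cat file.csv
```

Note that this still removes any data row that happens to be identical to the header, so it is only a sketch for inputs where that cannot occur.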

Upvotes: 3

rici

Reputation: 241861

A simpler awk command:

awk 'FNR>1 || NR==1' {0..1000}.file.csv

However, this does exactly the same thing as your original command, just without relying on next. It produces the expected output, but I don't see why your original doesn't; it did when I tried it.
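
For anyone who wants to check this themselves, here is a minimal reproduction, assuming the three sample files from the question are recreated in a scratch directory. The output name merged.csv is my own choice; FNR is the line number within the current file and NR is the line number across all files, so only the very first header passes the test.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'person_id, dob, year, subject, degree' '0,1984/12/01,2014,math,ms' > 0.file.csv
printf '%s\n' 'person_id, dob, year, subject, degree' > 1.file.csv
printf '%s\n' 'person_id, dob, year, subject, degree' '200,1990/03/12,2015,physics,bs' '201,1991/04/18,2015,math,ms' > 2.file.csv

# Print a line if it is not the first line of its file (FNR>1)
# or if it is the first line overall (NR==1).
awk 'FNR>1 || NR==1' 0.file.csv 1.file.csv 2.file.csv > merged.csv

cat merged.csv
```

The header-only file 1.file.csv contributes nothing, because its single line has FNR equal to 1 and NR equal to 3.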

Upvotes: 7

SpinUp __ A Davis

Reputation: 5521

Here is an alternate strategy using head and tail:

head -n 1 0.file.csv > file.csv
tail -qn +2 {0..1000}.file.csv >> file.csv

Contents of file.csv:

person_id, dob, year, subject, degree
0,1984/12/01,2014,math,ms
200,1990/03/12,2015,physics,bs
201,1991/04/18,2015,math,ms
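
If your tail does not accept several files together with -q, the same idea can be sketched as a loop that calls tail once per file. This version assumes the three sample files from the question and lists them explicitly; for the real data you would loop over all of 0.file.csv through 1000.file.csv.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'person_id, dob, year, subject, degree' '0,1984/12/01,2014,math,ms' > 0.file.csv
printf '%s\n' 'person_id, dob, year, subject, degree' > 1.file.csv
printf '%s\n' 'person_id, dob, year, subject, degree' '200,1990/03/12,2015,physics,bs' '201,1991/04/18,2015,math,ms' > 2.file.csv

# Write the header once, taken from the first file.
head -n 1 0.file.csv > file.csv

# Append everything from line 2 onward of each file; a header-only
# file simply produces no output here.
for f in 0.file.csv 1.file.csv 2.file.csv; do
    tail -n +2 "$f"
done >> file.csv

cat file.csv
```

Calling tail once per file is slower than a single invocation, but it avoids the per-file name banners without needing -q.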

Upvotes: 2
