Reputation: 1456
I have three CSV files with the same name (e.g. A_bestInd.csv) that are located in different subfolders. I want to copy all of them into one file (e.g. All_A_bestInd.csv). To do that, I did the following:
{ find . -type f -name A_bestInd.csv -exec cat '{}' \; ; } >> All_A_bestInd.csv
The result of this command is the following:
Class Conf 1 2 3 4 //header of file1
A Reduction 5 1 2 1
A Reduction 1 8 1 10
Class Conf 1 2 3 4 //header of file2
A No_red 2 1 3 2
A No_red 3 6 1 9
Class Conf 1 2 3 4 //header of file3
A Reduction 5 5 8 9
A Reduction 7 2 1 11
As you can see, the problem is that the header of every file is copied. How can I change my command to keep only one header and drop the rest?
Upvotes: 4
Views: 3542
Reputation: 50785
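Use awk to filter out the header line of every file but the first. This works as long as find can pass all the files to a single awk invocation; with thousands of files, find may split them across several invocations, and each invocation would print its own first header: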
find . -type f -name 'A_bestInd.csv' -exec awk 'NR==1 || FNR>1' {} + > 'All_A_bestInd.csv'
NR==1 || FNR>1 means: print the current line if it is line 1 counted from the start of all input (NR==1), or if its line number within the current file is greater than 1 (FNR>1).
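To see the two counters side by side, here is a minimal sketch. The scratch directory and the file names one.txt and two.txt are made up for the demo and are not part of the question.

```shell
# Hypothetical scratch directory and file names, used only for this demo.
tmp=$(mktemp -d)
printf 'h1\na\nb\n' > "$tmp/one.txt"
printf 'h2\nc\n' > "$tmp/two.txt"

# NR counts lines across all input; FNR restarts at 1 for each file.
nr_fnr=$(awk '{ print NR, FNR, $0 }' "$tmp/one.txt" "$tmp/two.txt")
printf '%s\n' "$nr_fnr"

# The filter from the answer: keep the very first line overall,
# plus every line that is not the first line of its own file.
merged=$(awk 'NR==1 || FNR>1' "$tmp/one.txt" "$tmp/two.txt")
printf '%s\n' "$merged"

rm -rf "$tmp"
```

The first awk call shows that h2 is line 4 of the input but line 1 of its file, which is why the filter skips it.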
$ cat A_bestInd.csv
Class Conf 1 2 3 4 //header of file3
A Reduction 5 5 8 9
A Reduction 7 2 1 11
$
$ cat foo/A_bestInd.csv
Class Conf 1 2 3 4 //header of file1
A Reduction 5 1 2 1
A Reduction 1 8 1 10
$
$ cat bar/A_bestInd.csv
Class Conf 1 2 3 4 //header of file2
A No_red 2 1 3 2
A No_red 3 6 1 9
$
$ find . -type f -name 'A_bestInd.csv' -exec awk 'NR==1 || FNR>1' {} + > 'All_A_bestInd.csv'
$
$ cat All_A_bestInd.csv
Class Conf 1 2 3 4 //header of file1
A Reduction 5 1 2 1
A Reduction 1 8 1 10
A Reduction 5 5 8 9
A Reduction 7 2 1 11
A No_red 2 1 3 2
A No_red 3 6 1 9
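If the file list is long enough for find to split it across several awk invocations, NR restarts in each one and an extra header slips through. One possible way around that, sketched under the assumptions that every file starts with a byte-identical header line and that no data row equals the header, is to concatenate first and drop repeated headers in a single awk process. The scratch directory and sample rows below are hypothetical.

```shell
# Hypothetical scratch tree standing in for the real subfolders.
work=$(mktemp -d)
mkdir -p "$work/foo" "$work/bar"
printf 'Class Conf 1 2 3 4\nA Reduction 5 5 8 9\n' > "$work/A_bestInd.csv"
printf 'Class Conf 1 2 3 4\nA Reduction 5 1 2 1\n' > "$work/foo/A_bestInd.csv"
printf 'Class Conf 1 2 3 4\nA No_red 2 1 3 2\n' > "$work/bar/A_bestInd.csv"

# cat may be started more than once, but awk runs exactly once,
# so NR==1 is the first header overall.
# Assumption: later header lines are identical to the first one.
( cd "$work" &&
  find . -type f -name 'A_bestInd.csv' -exec cat {} + |
    awk 'NR==1 { header = $0; print; next } $0 != header' > All_A_bestInd.csv )

total_lines=$(awk 'END { print NR }' "$work/All_A_bestInd.csv")
header_lines=$(grep -c '^Class' "$work/All_A_bestInd.csv")
echo "lines: $total_lines, headers: $header_lines"

rm -rf "$work"
```

The trade-off is that a data row identical to the header would also be dropped, so this only fits files where that cannot happen.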
Upvotes: -1
Reputation: 212374
There are solutions with tail +2 and awk, but it seems to me the classic way to print all but the first line of a file is sed: sed -e 1d. So:
find . -type f -name A_bestInd.csv -exec sed -e 1d '{}' \; >> All_A_bestInd.csv
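The \; terminator matters here: it starts one sed per file, so line 1 is each file's own header. A small sketch of the difference, using made-up file names in a scratch directory:

```shell
# Hypothetical scratch directory and file names, used only for this demo.
tmp=$(mktemp -d)
printf 'h1\na\n' > "$tmp/one.txt"
printf 'h2\nb\n' > "$tmp/two.txt"

# One sed per file (what -exec ... \; does): every header is line 1 of its own input.
per_file=$(for f in "$tmp/one.txt" "$tmp/two.txt"; do sed -e 1d "$f"; done)
printf '%s\n' "$per_file"

# One sed for both files (what -exec ... {} + would do): sed numbers lines
# across all files, so only the first header is deleted.
single_run=$(sed -e 1d "$tmp/one.txt" "$tmp/two.txt")
printf '%s\n' "$single_run"

rm -rf "$tmp"
```

Note that the per-file form removes every header, so the combined file ends up with no header line at all.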
Upvotes: 1
Reputation: 361849
Use tail -n +2 to trim the headers from all the files (tail +2 is the obsolete spelling of the same option and is not accepted everywhere).
find . -type f -name A_bestInd.csv -exec tail -n +2 {} \; >> All_A_bestInd.csv
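To keep just one header you could combine it with head -n 1. Note the ; before the closing brace: without it, } is passed to find as an argument instead of ending the group.
{ find . -type f -name A_bestInd.csv -exec head -n 1 {} \; -quit
find . -type f -name A_bestInd.csv -exec tail -n +2 {} \; ; } >> All_A_bestInd.csv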
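As a quick illustration of what the two tools contribute, here is a minimal sketch on a single in-memory sample, using the -n spelling of both options. The sample rows are copied from the question; the variable names are made up.

```shell
# Sample content: one header line followed by two data rows.
sample=$(printf 'Class Conf 1 2 3 4\nA No_red 2 1 3 2\nA No_red 3 6 1 9\n')

# head -n 1 keeps only the first line (the header).
header=$(printf '%s\n' "$sample" | head -n 1)

# tail -n +2 starts output at line 2, i.e. everything except the header.
body=$(printf '%s\n' "$sample" | tail -n +2)

printf 'header: %s\n' "$header"
printf '%s\n' "$body"
```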
Upvotes: 2