Mojo Dodo

Reputation: 67

Combine csv-files by columns with awk

I have several CSV files that I want to combine column-wise, i.e. the content of each file should be appended as new columns (not as new lines). I only need the content starting at line 3 (which I handle with FNR>2) and columns 1-3 in the order 1,3,2 (which I do with print $1,$3,$2). However, the files may contain different numbers of lines.

file1.csv

bla bla bla
bla bla bla
Heading1;Heading2;Heading3;Heading4
11;12;13;useless
21;22;23;useless
31;32;33;useless

file2.csv

bla bla bla
bla bla bla
Heading1;Heading2;Heading3;Heading4
11;12;13;useless
21;22;23;useless
31;32;33;useless
41;42;43;useless

The output should be:

Heading1;Heading3;Heading2;Heading1;Heading3;Heading2
11;13;12;11;13;12
21;23;22;21;23;22
31;33;32;31;33;32
;;;41;43;42

This gives me the desired content, but each file's lines are appended as new rows rather than as new columns:

awk 'BEGIN {FS=";"; OFS = ";"} FNR>2 {print $1,$3,$2}' path/to/files/*csv

This appends columns, but of course never starts a new line:

awk 'BEGIN {FS=";"; OFS = ";"} FNR>2 {printf "%s%s%s%s%s", $1,OFS,$3,OFS,$2}' path/to/files/*csv

And this behaves the same as my first command with print:

awk 'BEGIN {FS=";"; OFS = ";"} FNR>2 {printf "%s%s%s%s%s", $1,OFS,$3,OFS,$2"\n"}' path/to/files/*csv

Thanks in advance for your suggestions!

Upvotes: 0

Views: 467

Answers (1)

Ed Morton

Reputation: 203229

$ cat tst.awk
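BEGIN { FS=OFS=";" }
FNR==1 { numFiles++ }               # first line of each input file: count the file
{ rowNr = FNR - 2 }                 # skip the two leading lines of every file
rowNr > 0 {
    rows[rowNr,numFiles] = $0       # store the line, keyed by row and file number
    numRows = (numRows > rowNr ? numRows : rowNr)   # remember the longest file
}
END {
    for (rowNr=1; rowNr<=numRows; rowNr++) {
        sep = ""
        for (fileNr=1; fileNr<=numFiles; fileNr++) {
            split(rows[rowNr,fileNr],row)   # a missing row yields empty fields
            printf "%s", sep row[1] OFS row[3] OFS row[2]
            sep = OFS
        }
        print ""
    }
}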

$ awk -f tst.awk file1.csv file2.csv
Heading1;Heading3;Heading2;Heading1;Heading3;Heading2
11;13;12;11;13;12
21;23;22;21;23;22
31;33;32;31;33;32
;;;41;43;42

With GNU awk you can use ARGIND instead of setting/using numFiles.
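
A minimal sketch of that ARGIND variant, assuming gawk is installed and that only file names (no var=value assignments, which also occupy ARGV slots) are passed on the command line. The wrapper function name combine is hypothetical and only there to make the example easy to call:

```shell
# Hypothetical wrapper around a GNU awk script; requires gawk for ARGIND.
combine() {
    gawk '
    BEGIN { FS = OFS = ";" }
    FNR > 2 {
        rowNr = FNR - 2                    # row index after skipping two lines
        rows[rowNr, ARGIND] = $0           # ARGIND = ARGV index of the current file
        if (rowNr > numRows) numRows = rowNr
    }
    END {
        # In END, ARGIND still holds the index of the last file read.
        for (rowNr = 1; rowNr <= numRows; rowNr++) {
            sep = ""
            for (fileNr = 1; fileNr <= ARGIND; fileNr++) {
                split(rows[rowNr, fileNr], row)   # missing rows give empty fields
                printf "%s", sep row[1] OFS row[3] OFS row[2]
                sep = OFS
            }
            print ""
        }
    }' "$@"
}
```

Calling combine file1.csv file2.csv should print the same five lines as the tst.awk run above. One difference to be aware of: an empty input file still advances ARGIND, so it would contribute a block of empty columns, whereas the numFiles counter never sees it.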

Upvotes: 2
