Elabore

Reputation: 151

How to interleave columns from two files?

Let's say we have two files (same sized m*n matrices), with columns:

A1, A2, A3, A4, ..., An 

and

B1, B2, B3, B4, ..., Bn 

The expected output would be:

A1, B1, A2, B2, A3, B3, A4, B4, ..., An, Bn 

How can this be done? I guess that there are some awk one-liners, but I haven't been able to build the right one...

Upvotes: 0

Views: 486

Answers (6)

ssanch

Reputation: 425

One more solution using awk and paste wouldn't hurt.

paste -d',' file1 file2 | awk -F ',' '{ 
    z = ""
    for (i=1; i <= NF/2; ++i){ 
        x = i+(NF/2)
        y = $i","$x
        z = z","y
    }
    print substr(z,2,length(z))
}' 

First, paste -d',' file1 file2 joins the two files row by row, using the same field separator (,) between them.

Then, in awk, -F ',' sets the comma as the field separator. The loop runs over the first half of the column indices (i <= NF/2), finds the matching column from the second file (x = i+(NF/2)), and builds a comma-separated pair y = $i","$x. Each pair is appended to z, which starts out empty (z = z","y), and after the loop the result is printed without its leading comma: print substr(z,2,length(z)).
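
To make the steps concrete, here is a small sketch that runs the pipeline above on two hypothetical two-row files. The temp directory, the sample values and the variable name interleaved are assumptions made for illustration; the data uses bare commas (no spaces after them), since the awk part splits on "," only.

```shell
# Hypothetical sample data in a throwaway directory (names are illustrative).
dir=$(mktemp -d)
printf 'a1,a2,a3\na4,a5,a6\n' > "$dir/file1"
printf 'b1,b2,b3\nb4,b5,b6\n' > "$dir/file2"

# paste glues row k of file1 to row k of file2; awk then pairs
# column i with column i + NF/2 of the combined row.
interleaved=$(paste -d',' "$dir/file1" "$dir/file2" | awk -F ',' '{
    z = ""
    for (i = 1; i <= NF/2; ++i) {
        x = i + (NF/2)
        y = $i "," $x
        z = z "," y
    }
    print substr(z, 2, length(z))   # drop the leading comma
}')

printf '%s\n' "$interleaved"
rm -rf "$dir"
```

With this input the two printed rows should be a1,b1,a2,b2,a3,b3 and a4,b4,a5,b5,a6,b6.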

Personally, I found this a bit more explicit than some of the previous solutions using awk.

Upvotes: 0

ghoti

Reputation: 46826

If, as your input suggests, you're only using a single line of each input, then processing by record might be easier than processing by field. You can read one file through stdin, and read the other file explicitly.

As a one-liner, this might look like this:

awk 'BEGIN {ORS=RS=","} {print $1; getline < "file2"; print $1}' file1; echo

Broken out for easier reading with comments:

awk '
  BEGIN { ORS=RS="," }     # record separator is a comma!
  {
    print $1               # print a trimmed (1-field) record from the first file,
    getline < "file2"      # then get the next record from the second file.
    print $1               # print a record from the second file.
  }
' file1
echo                       # print a newline, since awk didn't.

If you'd prefer your output to have spaces after the comma you could replace the code in the BEGIN block with:

  BEGIN {RS=","; ORS=", "}
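
Because ORS is printed after every record, the output of this approach ends with a trailing separator. A possible variation, sketched here under assumptions, removes it with sed and passes the second file name in through an awk variable (other is a name made up for this example, as are the sample files):

```shell
# Hypothetical single-line inputs, as in the question.
dir=$(mktemp -d)
printf 'A1, A2, A3\n' > "$dir/file1"
printf 'B1, B2, B3\n' > "$dir/file2"

result=$(awk -v other="$dir/file2" '
  BEGIN { RS = ","; ORS = ", " }          # read and write comma-separated records
  {
    print $1                              # trimmed record from the first file
    if ((getline < other) > 0) print $1   # matching record from the second file
  }
' "$dir/file1" | sed 's/, $//')           # strip the separator left after the last record

printf '%s\n' "$result"
rm -rf "$dir"
```

For these inputs the result should be A1, B1, A2, B2, A3, B3, with no separator left at the end.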

Upvotes: 0

RomanPerekhrest

Reputation: 92854

paste + tr + sed trick (needs a shell with process substitution, such as bash, and GNU sed for \+):

file1 contents:

A1, A2, A3, A4, A5, A6, A7

file2 contents:

B1, B2, B3, B4, B5, B6, B7

paste <(tr ',' '\n' <file1) <(tr ',' '\n' <file2) | paste -s | sed 's/[[:space:]]\+/, /g'

The output:

A1, B1, A2, B2, A3, B3, A4, B4, A5, B5, A6, B6, A7, B7
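
If process substitution or GNU sed is not available, the same idea can be sketched with temporary files and plain POSIX tools. This is an alternative under assumptions, not the command above: the intermediate files col1 and col2 and the variable joined are invented for the example.

```shell
# Hypothetical inputs matching the sample above (shortened).
dir=$(mktemp -d)
printf 'A1, A2, A3\n' > "$dir/file1"
printf 'B1, B2, B3\n' > "$dir/file2"

# Put one value per line, dropping the spaces that follow the commas.
tr -d ' ' < "$dir/file1" | tr ',' '\n' > "$dir/col1"
tr -d ' ' < "$dir/file2" | tr ',' '\n' > "$dir/col2"

# Pair the values line by line, then fold all lines into a single row.
joined=$(paste -d',' "$dir/col1" "$dir/col2" | paste -s -d',' - | sed 's/,/, /g')

printf '%s\n' "$joined"
rm -rf "$dir"
```

Here the final sed only re-inserts a space after each comma, so it needs no GNU-specific regex syntax.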

Upvotes: 0

James Brown

Reputation: 37394

Using tr and rs (reshape a data array), if available. If not, talk to your local admin or hack the planet. First, test data:

$ cat foo bar
a1,a2,a3
b1,b2,b3

send that to tr replacing , with space:

$ cat foo bar | tr , ' '
a1 a2 a3
b1 b2 b3

and on to rs for transposing:

$ cat foo bar | tr , ' ' | rs -T
a1  b1
a2  b2
a3  b3

and finally to another rs to squeeze the previous output onto one line:

$ cat foo bar | tr , ' ' | rs -T | rs 1
a1  b1  a2  b2  a3  b3

The last rs could be replaced with tr '\n' ' '. rs honors delimiters for input and output; see the man page for that. I left the commas out intentionally.

Upvotes: 0

Ed Morton

Reputation: 203254

awk '
    BEGIN { FS=OFS=", " }
    NR==FNR { a[NR]=$0; next }
    {
        split(a[FNR],f)
        for (i=1;i<=NF;i++) {
            printf "%s%s%s%s", f[i], OFS, $i, (i<NF?OFS:ORS)
        }
    }
' a.txt b.txt
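
As a quick check of how the script above behaves, the sketch below runs it on two hypothetical two-row files that use ", " as the separator, as in the question. The sample values, the temp directory and the variable merged are assumptions for illustration; the comments are added here and are not part of the original answer.

```shell
# Hypothetical sample matrices with ", " between columns.
dir=$(mktemp -d)
printf 'A1, A2, A3\nA4, A5, A6\n' > "$dir/a.txt"
printf 'B1, B2, B3\nB4, B5, B6\n' > "$dir/b.txt"

merged=$(awk '
    BEGIN { FS=OFS=", " }
    NR==FNR { a[NR]=$0; next }      # first file: remember each row by line number
    {
        split(a[FNR],f)             # split the stored row of the first file on FS
        for (i=1;i<=NF;i++) {       # emit pairs: first-file value, second-file value
            printf "%s%s%s%s", f[i], OFS, $i, (i<NF?OFS:ORS)
        }
    }
' "$dir/a.txt" "$dir/b.txt")

printf '%s\n' "$merged"
rm -rf "$dir"
```

Each output row should alternate the values of the two inputs, for example A1, B1, A2, B2, A3, B3 for the first row.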

Upvotes: 2

George Vasiliou

Reputation: 6335

Something like this seems ok in my tests, considering that both files have the same number of lines and fields = same array dimensions:

$ cat file1
a1,a2,a3
a4,a5,a6

$ cat file2
b1,b2,b3
b4,b5,b6

$ awk 'NR==FNR{f1[FNR]=$0;next};{n=split(f1[FNR],ff1,",");split($0,ff2,","); \
for (f=1;f<=n;f++) printf "%s,%s%s",ff1[f],ff2[f],(f!=n?",":"\n")}' file1 file2
a1,b1,a2,b2,a3,b3
a4,b4,a5,b5,a6,b6

Quick explanation:
awk reads the first file in full and then the second file.
NR==FNR{f1[FNR]=$0;next} : while reading the first file, store each whole line $0 in the array f1, indexed by its line number in file1.

Once the first file is finished, the rest of the code runs while file2 is processed:

n=split(f1[FNR],ff1,",") : since both files have the same number of lines, this splits the previously stored record from file1 (held in array f1) into a new array ff1, using comma as the delimiter, and keeps the number of fields in n.

split($0,ff2,",") : similarly, this splits $0 (the current line of file2) into an array named ff2, using comma as the delimiter.

for (f=1;f<=n;f++) printf "%s,%s%s",ff1[f],ff2[f],(f!=n?",":"\n")
This iterates through the elements of ff1 (ff1 has the same length as ff2) and prints the matching values from ff1 and ff2. The explicit format string keeps a % in the data from being interpreted by printf.

(f!=n?",":"\n") : this yields a comma while we have not reached the end of ff1/ff2, and a newline character \n otherwise.

Upvotes: 1
