geneteics_diva

Reputation: 109

Adding columns from one file after a column in a second file when the two files have different numbers of columns

File 1

001 00A 892 J27
002 00G 742 M65
003 00B 934 B32
004 00J 876 K57
005 00k 543 N21

File 2 has 1,628,433 columns. I would like to insert all four columns from File 1 after the first column of this file.

a 2 T ..........
b 3 C ..........
c 4 G ..........
d 5 A ..........
e 6 B ..........

Desired output

a 001 00A 892 J27 2 T ..........
b 002 00G 742 M65 3 C ..........
c 003 00B 934 B32 4 G ..........
d 004 00J 876 K57 5 A ..........
e 005 00k 543 N21 6 B ..........

I tried the following:

awk 'NR==FNR{a[FNR]=$1,$2,$3,$4} {print $1,a[FNR],$5}' file2 file1

Upvotes: 1

Views: 488

Answers (5)

geneteics_diva

Reputation: 109

awk -F'\t' -v OFS="\t" '{getline f1 < "file1"; $1 = $1 OFS f1; print}' file2

Upvotes: 1

Ed Morton

Reputation: 203577

$ paste -d' ' <(cut -d' ' -f1 file2) file1 <(cut -d' ' -f2- file2)
a 001 00A 892 J27 2 T ..........
b 002 00G 742 M65 3 C ..........
c 003 00B 934 B32 4 G ..........
d 004 00J 876 K57 5 A ..........
e 005 00k 543 N21 6 B ..........

Upvotes: 3

dawg

Reputation: 103844

Here is a Python script that processes the input files one line at a time:

python3 -c '
import sys
with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
    for l1, l2 in zip(f1,f2):
        lf1,lf2=map(str.split, [l1,l2])
        print(" ".join([lf2[0]]+lf1+lf2[1:]))
' file1 file2

Upvotes: 1

RavinderSingh13

Reputation: 133518

With your shown samples, please try the following awk code.

awk 'FNR==NR{arr[FNR]=$0;next} {$1=$1 OFS arr[FNR]} 1' file1 file2

Explanation: The FNR==NR condition is true only while the first file on the command line, file1, is being read. During that pass, each whole line is stored in an array indexed by its line number, and next skips the rest of the program. While file2 is being read, the stored line for the current line number is appended to the first field, and the trailing 1 prints the modified line.
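
The store-then-merge idea, keyed by line number, can also be sketched in Python. This is only an illustration: the function name merge_by_line_number and the in-memory sample rows are assumptions, and it assumes file2 has no more lines than file1 and that every file2 line has at least two space-separated fields.

```python
def merge_by_line_number(file1_lines, file2_lines, sep=" "):
    """Insert each file1 line after the first field of the matching file2 line."""
    # First pass: remember every file1 line, keyed by its line number
    # (the role the awk array indexed by FNR plays).
    stored = {n: line.rstrip("\n") for n, line in enumerate(file1_lines, start=1)}
    merged = []
    # Second pass: split off only the first field of each file2 line,
    # so the remaining columns are never split individually.
    for n, line in enumerate(file2_lines, start=1):
        first, _, rest = line.rstrip("\n").partition(sep)
        merged.append(sep.join([first, stored[n], rest]))
    return merged


# Two sample rows from the question, held in memory for illustration.
file1 = ["001 00A 892 J27\n", "002 00G 742 M65\n"]
file2 = ["a 2 T\n", "b 3 C\n"]
for row in merge_by_line_number(file1, file2):
    print(row)
```

Only the small file is held in memory here; the wide file is still handled one line at a time.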

Upvotes: 3

glenn jackman

Reputation: 246817

This version is lighter on memory: it only reads one line at a time from each file:

awk '{getline f1 < "file1"; $1 = $1 OFS f1; print}' file2
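
For comparison, the same one-line-at-a-time pattern could look roughly like this in Python. The generator name merge_streaming is hypothetical, and io.StringIO objects stand in for the real files; this is a sketch under those assumptions, not a drop-in replacement for the awk command.

```python
import io


def merge_streaming(f1, f2, sep=" "):
    """Yield merged lines while holding only one line from each input at a time."""
    # zip() pulls one line from each file-like object per iteration
    # and stops at the end of the shorter input.
    for l1, l2 in zip(f1, f2):
        # Split off only the first field of the wide line.
        first, _, rest = l2.rstrip("\n").partition(sep)
        yield sep.join([first, l1.rstrip("\n"), rest])


# In-memory stand-ins for file1 and file2 (sample rows from the question).
f1 = io.StringIO("001 00A 892 J27\n002 00G 742 M65\n")
f2 = io.StringIO("a 2 T\nb 3 C\n")
for row in merge_streaming(f1, f2):
    print(row)
```

With real files, the two StringIO objects would be replaced by open("file1") and open("file2").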

Upvotes: 3
