Appending text to column in existing tab-delimited file from another file

Question

In one tab-delimited file, it takes the form of:

ID | integer | integer

1   12345    56789
2   11111    233323
3   43423    789798

Basically file1.txt is a tab-delimited file where the first column is the primary key of a sql database numbered 1-50. The other two columns are just numbers that correspond to our data set.

Then in another file I have: string | integer

In file2.txt , the ID is a string where the digit is the corresponding ID in file1.txt. I want to append the integer data from file2.txt to file1.txt.
I want the output to be

1   12345    56789    234234
2   11111    233323   987987
3   43423    789798   234234

Because of the way the data was originally created in the file2 database, the string is the ID. So when we order it, it's c1,c10,c11,c12...c2,c20,c21,c22.... etc. I want to line up the numbers with the first file and append the data. How can I do this? I'm pretty noob to bash and only found the paste command if my columns already lined up perfectly. Thanks.

Steve · Accepted Answer

One way using GNU awk:

awk 'BEGIN { FS=OFS="	" } FNR==NR { array[substr($1, 2)]=$2; next } { if ($1 in array) print $0, array[$1] }' file2.txt file1.txt

Results:

1   12345   56789   234234
2   11111   233323  987987
3   43423   789798  234234

Edit:

Perhaps the string in file2.txt could contain many non-digit characters. If this is the case, then try the following:

awk 'BEGIN { FS=OFS="	" } FNR==NR { gsub(/[^0-9]+/, "", $1); array[$1]=$2; next } { if ($1 in array) print $0, array[$1] }' file2.txt file1.txt

Results:

1   12345   56789   234234
2   11111   233323  987987
3   43423   789798  234234

Appending text to column in existing tab-delimited file from another file

Answers (1)

Related Questions