AWK: search substring in first file against second

Question

I have the following files:

data.txt

Estring|0006|this_is_some_random_text|more_text
Fstring|0010|random_combination_of_characters
Fstring|0028|again_here

allids.txt (here the columns are separated by semicolon; the real input is tab-delimited)

Estring|0006;MAR0593
Fstring|0002;MAR0592
Fstring|0028;MAR1195

Please note: in data.txt the important part is the first two "columns" (name|number).

Now I want to use awk to look up the first part (name|number) of each line of data.txt in allids.txt and append the matching second column (the ID starting with MAR).

So my expected output would be (again tab-delimited in the real data, shown here with a semicolon):

Estring|0006|this_is_some_random_text|more_text;MAR0593
Fstring|0010|random_combination_of_characters
Fstring|0028|again_here;MAR1195

I do not know how to match on that leading part within awk. The rest should then look roughly like this:

awk 'BEGIN{FS=OFS="	"} FNR == NR { a[$1] = $1; next } $1 in a { print a[$0], [$1] }' data.txt allids.txt

hek2mgl · Accepted Answer

I would use a set of field delimiters, like this:

awk -F'[|\t;]' 'NR==FNR{a[$1"|"$2]=$0; next}
                ($1"|"$2) in a {print a[$1"|"$2]"\t"$NF}' data.txt allids.txt

With your real, tab-delimited data you can drop the ; from the set of field delimiters. It is only in there so that the example from the question can be reproduced.
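
Note that the command above prints only the lines of data.txt whose name|number key also occurs in allids.txt, whereas the expected output in the question keeps the unmatched Fstring|0010 line as well. If that matters, one possible variation is to read allids.txt first and then print every line of data.txt, appending the ID only when one is found. The sketch below is an assumption-laden example rather than part of the original answer: it assumes a real tab between key and ID in allids.txt, recreates the sample files in a temporary directory, and uses the made-up names `id`, `part`, `key` and `result`.

```shell
# Recreate the sample files from the question (allids.txt with real tabs).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'Estring|0006|this_is_some_random_text|more_text' \
  'Fstring|0010|random_combination_of_characters' \
  'Fstring|0028|again_here' > "$tmpdir/data.txt"
printf '%s\t%s\n' \
  'Estring|0006' 'MAR0593' \
  'Fstring|0002' 'MAR0592' \
  'Fstring|0028' 'MAR1195' > "$tmpdir/allids.txt"

# First file (allids.txt): remember the ID for each "name|number" key.
# Second file (data.txt): rebuild the key from the first two pipe-separated
# pieces and append the ID if the key is known; otherwise print the line as is.
result=$(awk -F'\t' '
  NR == FNR { id[$1] = $2; next }
  {
    split($0, part, "|")
    key = part[1] "|" part[2]
    if (key in id)
      print $0 "\t" id[key]
    else
      print $0
  }
' "$tmpdir/allids.txt" "$tmpdir/data.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

The file order is reversed compared with the command above (allids.txt comes first), because the lookup table has to be complete before data.txt is read.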
