Icaro Americo

Reputation: 35

Get Value from file by key file using awk

I'm kinda new to awk. I'm trying to get values from one file using keys from another file.

Value File:

1   39485063845911  RANDOMTEXT    RANDOMNUMBERS
1   39485063845912  RANDOMTEXT    RANDOMNUMBERS
1   39485063845913  RANDOMTEXT    RANDOMNUMBERS
1   39485063845914  RANDOMTEXT    RANDOMNUMBERS

Key File:

1   39485063845911  RANDOMTEXT
1   39485063845912  RANDOMTEXT

I tried to adapt a previous awk script that I had, but couldn't get the job done:

awk 'BEGIN {FIELDWIDTHS="7 14 3 28 3 25"} NR==FNR {data["0"$14];next} NR!=FNR {FIELDWIDTHS="7 14 3 28"} {if(!($14) in data) {print $0}}' file

The numbers inside FIELDWIDTHS are the column widths (both are fixed-width, positional files), and I used $14 because the key column is 14 characters wide.

So the output for the above example should be:

1   39485063845911  RANDOMTEXT    RANDOMNUMBERS
1   39485063845912  RANDOMTEXT    RANDOMNUMBERS

Upvotes: 1

Views: 185

Answers (3)

David C. Rankin

Reputation: 84551

Or throwing a third possibility into the mix, this can be as easy as a grep -f. For example:

grep -f keyfile valuefile

(note: this requires that the whitespace separating the values match exactly between the two files. If it does not, a field-based awk approach is the right tool)

This uses each line of keyfile as a pattern to match within valuefile.

Example Use/Output

For your example above:

$ grep -f keyfile valuefile
1   39485063845911  RANDOMTEXT    RANDOMNUMBERS
1   39485063845912  RANDOMTEXT    RANDOMNUMBERS
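
A self-contained sketch of the same idea, assuming the sample data from the question and the file names keyfile and valuefile. The -F flag is an addition here, not part of the command above: it makes grep treat each key line as a literal string rather than a regular expression, which is probably what you want if keys could ever contain characters such as . or *.

```shell
work=$(mktemp -d)   # scratch directory for the sample files

cat > "$work/valuefile" <<'EOF'
1   39485063845911  RANDOMTEXT    RANDOMNUMBERS
1   39485063845912  RANDOMTEXT    RANDOMNUMBERS
1   39485063845913  RANDOMTEXT    RANDOMNUMBERS
1   39485063845914  RANDOMTEXT    RANDOMNUMBERS
EOF

cat > "$work/keyfile" <<'EOF'
1   39485063845911  RANDOMTEXT
1   39485063845912  RANDOMTEXT
EOF

# -f reads one pattern per line from keyfile; -F matches those patterns literally
matched=$(grep -F -f "$work/keyfile" "$work/valuefile")
printf '%s\n' "$matched"
```

Note that an empty line in keyfile would match every line of valuefile, so it may be worth checking the key file for blank lines first.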

Upvotes: 2

jhnc

Reputation: 16662

Your code, with line-breaks for clarity:

awk '
    BEGIN {FIELDWIDTHS="7 14 3 28 3 25"}
    NR==FNR {data["0"$14];next}
    NR!=FNR {FIELDWIDTHS="7 14 3 28"}
    {if(!($14) in data) {print $0}}
' file
  1. You set FIELDWIDTHS on every line of the second (or later) file, rather than just once, which is inefficient
  2. You only read a single file, so nothing will ever get printed
  3. You seem to think $14 relates in some way to the field with length 14; it actually refers to the 14th field
  4. You appear to have negated the test: to print records from the value file that match records in the key file, use if (x in y), not if (!(x in y))
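
Point 3 can be checked with any awk. A minimal sketch, using a made-up three-field line: $14 means "the 14th field", so on a record with fewer than 14 fields it is simply empty.

```shell
# Hypothetical input with only three whitespace-separated fields.
out=$(printf 'a b c\n' | awk '{ print NF; print "[" $14 "]" }')
printf '%s\n' "$out"
# prints 3, then [] because there is no 14th field
```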

Perhaps you need something like:

gawk '
  FNR==1 { FIELDWIDTHS = NR==FNR ? "7 14 3 28" : "7 14 3 28 3 25" }
  NR==FNR { keys[$2]++; next }
  $2 in keys  { print }
' keyfile valuefile

This:

  • only sets FIELDWIDTHS once per input file
  • uses both a key file and a value file
  • refers to field 2 ($2) which seems to be the one you wish to be the key
  • tests for existence rather than absence
  • explicitly uses gawk instead of awk to avoid nasty surprises (if a version that doesn't support the non-POSIX FIELDWIDTHS gets used)
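
If gawk is not available, a portable sketch of the same lookup can use substr() instead of FIELDWIDTHS. This assumes the layout implied by the FIELDWIDTHS strings above (a 7-character field followed by the 14-character key, so the key sits in columns 8 to 21); the sample rows below are hypothetical and padded to those widths, they are not the data from the question.

```shell
work=$(mktemp -d)   # scratch directory for the sample files

# Hypothetical fixed-width rows: widths 7 14 3 28 (key file)
printf '%-7s%-14s%-3s%-28s\n' \
    1 39485063845911 AB RANDOMTEXT \
    1 39485063845912 AB RANDOMTEXT > "$work/keyfile"

# Hypothetical fixed-width rows: widths 7 14 3 28 3 25 (value file)
printf '%-7s%-14s%-3s%-28s%-3s%-25s\n' \
    1 39485063845911 AB RANDOMTEXT CD RANDOMNUMBERS \
    1 39485063845912 AB RANDOMTEXT CD RANDOMNUMBERS \
    1 39485063845913 AB RANDOMTEXT CD RANDOMNUMBERS \
    1 39485063845914 AB RANDOMTEXT CD RANDOMNUMBERS > "$work/valuefile"

result=$(awk '
  { key = substr($0, 8, 14) }    # columns 8-21 hold the key in both files
  NR == FNR { keys[key]; next }  # first file: remember each key
  key in keys                    # second file: print rows whose key was seen
' "$work/keyfile" "$work/valuefile")
printf '%s\n' "$result"
```

The trade-off is that the column offsets are hard-coded, whereas FIELDWIDTHS documents the whole record layout in one place.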

Upvotes: 3

Ed Morton

Reputation: 203413

I know you're talking about FIELDWIDTHS and character positions in your question, but you also said "I'm kinda new to awk" and there are several beginner mistakes in your script, so you may not be fully aware of how to use it. Given the example you provided, all you actually need is:

$ awk 'NR==FNR{a[$2]; next} $2 in a' key values
1   39485063845911  RANDOMTEXT    RANDOMNUMBERS
1   39485063845912  RANDOMTEXT    RANDOMNUMBERS

If that's not all you need, then edit your question to provide more realistic sample input/output, including cases where the above doesn't work.
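
For reference, here is the same one-liner spelled out with comments. This is only a sketch: it assumes the key is the second whitespace-separated field and recreates the question's sample data under the file names key and values in a scratch directory.

```shell
work=$(mktemp -d)   # scratch directory for the sample files

cat > "$work/key" <<'EOF'
1   39485063845911  RANDOMTEXT
1   39485063845912  RANDOMTEXT
EOF

cat > "$work/values" <<'EOF'
1   39485063845911  RANDOMTEXT    RANDOMNUMBERS
1   39485063845912  RANDOMTEXT    RANDOMNUMBERS
1   39485063845913  RANDOMTEXT    RANDOMNUMBERS
1   39485063845914  RANDOMTEXT    RANDOMNUMBERS
EOF

found=$(awk '
  NR == FNR { a[$2]; next }  # first file only: merely referencing a[$2] stores the key
  $2 in a                    # second file: a true condition with no action prints the line
' "$work/key" "$work/values")
printf '%s\n' "$found"
```

NR==FNR is only true while reading the first file, because NR counts lines across all files while FNR restarts at 1 for each file. One caveat: if the key file is empty, that condition is also true for the second file, so nothing gets printed.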

Upvotes: 2
