biotech

Reputation: 727

awk field separator not working for first line

echo 'NODE_1_length_317516_cov_18.568_ID_4005' | awk 'FS="_length" {print $1}'

Obtained output:

NODE_1_length_317516_cov_18.568_ID_4005

Expected output:

NODE_1

How is that possible? I'm missing something.

Upvotes: 15

Views: 8097

Answers (3)

fedorqui

Reputation: 289725

When Awk goes through the input, the field separator is applied before the record is processed. Awk reads and splits each record according to the current values of FS and RS, and only then runs the operations you asked for.

This means that if you set FS while a record is being processed, the new value has no effect on that record. It takes effect when the next record is read, and stays in effect from then on.

So if you have a file like this:

$ cat file
1,2 3,4
5,6 7,8

And you set the field separator while reading one record, it takes effect from the next line:

$ awk '{FS=","} {print $1}' file
1,2                             # FS is still the space!
5
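
The same thing happens with the one-liner in the question. There, FS="_length" sits in the pattern position, so it is evaluated as an expression for each record, after that record has already been split. A minimal sketch, assuming a second input line (NODE_2_length_225000 is made up for illustration):

```shell
# Two records: the first is a shortened form of the question's input,
# the second is hypothetical.
out=$(printf '%s\n' 'NODE_1_length_317516' 'NODE_2_length_225000' |
      awk 'FS="_length" {print $1}')
printf '%s\n' "$out"
# NODE_1_length_317516    <- split with the default FS, before the assignment ran
# NODE_2                  <- split with FS="_length"
```

With a single input line, as in the question, only the first case is ever seen, which is why the separator appears not to work at all.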

So what you want to do is set FS before Awk starts reading the file. That is, set it in the BEGIN block or with the -F option:

$ awk 'BEGIN{FS=","} {print $1}' file
1                               # now, FS is the comma
5
$ awk -F, '{print $1}' file
1
5

There is also another way: make Awk recompute the full record with {$0=$0}. With this, Awk will take into account the current FS and act accordingly:

$ awk '{FS=","} {$0=$0;print $1}' file
1
5
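
Applied to the input from the question, the re-split trick and the BEGIN approach should give the same result. A short sketch (the variable names are only illustrative):

```shell
line='NODE_1_length_317516_cov_18.568_ID_4005'

# FS is set too late for the first split, so $0=$0 forces a re-split with the new FS.
resplit_out=$(printf '%s\n' "$line" | awk '{FS="_length"} {$0=$0; print $1}')

# FS is set in BEGIN, before the first record is read.
begin_out=$(printf '%s\n' "$line" | awk 'BEGIN{FS="_length"} {print $1}')

printf '%s\n' "$resplit_out" "$begin_out"
# NODE_1
# NODE_1
```

The BEGIN form is usually the clearer choice; $0=$0 is mainly useful when the separator has to change partway through the input.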

Upvotes: 21

Harshit Anand

Reputation: 702

The awk statement is used incorrectly.

The correct form is as follows, with #{delimiter} standing in for the separator you need:

awk 'BEGIN { FS = "#{delimiter}" } ; { print $1 }'

In your case you can use:

awk 'BEGIN { FS = "_length" } ; { print $1 }'

Upvotes: 3

riteshtch

Reputation: 8769

Built-in variables such as FS and ORS should be set inside an action block: BEGIN, a pattern-action block, or END. For FS to apply to the first record, set it in BEGIN:

$ echo 'NODE_1_length_317516_cov_18.568_ID_4005' | awk 'BEGIN{FS="_length"} {print $1}'
NODE_1
$

You can also pass the delimiter with the -F switch, like this:

$ echo 'NODE_1_length_317516_cov_18.568_ID_4005' | awk -F "_length" '{print $1}'
NODE_1
$
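
Another spelling, not used in the answers here, is -v FS=... on the command line. Awk processes -v assignments before BEGIN runs, so the separator is already in place for the first record. A small sketch (v_out is just an illustrative name):

```shell
# -v assigns FS before any input is read, so the first record is split correctly.
v_out=$(echo 'NODE_1_length_317516_cov_18.568_ID_4005' | awk -v FS='_length' '{print $1}')
echo "$v_out"
# NODE_1
```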

Upvotes: 0
