olala

Reputation: 4446

Simple awk command issue (FS, OFS related)

I tried to reorganize the format of a file containing:

>Human|chr16:86430087-86430726 | element 1 | positive
>Human|chr16:85620095-85621736 | element 2 | negative
>Human|chr16:80423343-80424652 | element 3 | negative
>Human|chr16:80372593-80373755 | element 4 | positive
>Human|chr16:79969907-79971297 | element 5 | negative
>Human|chr16:79949950-79951518 | element 6 | negative
>Human|chr16:79026563-79028162 | element 7 | negative
>Human|chr16:78933253-78934686 | element 9 | negative
>Human|chr16:78832182-78833595 | element 10 | negative

My command is:

awk '{FS="|";OFS="\t"} {print $1,$2,$3,$4,$5}'

Here is the output:

>Human|chr16:86430087-86430726  |      element 1      |
>Human  chr16:85620095-85621736         element 2      negative
>Human  chr16:80423343-80424652         element 3      negative
>Human  chr16:80372593-80373755         element 4      positive
>Human  chr16:79969907-79971297         element 5      negative
>Human  chr16:79949950-79951518         element 6      negative
>Human  chr16:79026563-79028162         element 7      negative
>Human  chr16:78933253-78934686         element 9      negative
>Human  chr16:78832182-78833595         element 10     negative

Every line is split correctly except for the first one. I don't understand why this happens.

Can someone help me with it? Thanks!

Upvotes: 28

Views: 63849

Answers (4)

Shiyi Yin

Reputation: 1

I would like to add that

awk '{FS="|";OFS="\t"} {print $1,$2,$3,$4,$5}'

and

awk '{print $1,$2,$3,$4,$5} {FS="|";OFS="\t"}'

produce almost the same output in my version of awk.

The only difference is that the version that sets FS and OFS first adds one extra \t at the end of the first line.

@Thor has the best answer https://stackoverflow.com/a/16203497/22188182

#simplest
awk '{print $1,$2}' FS=',' OFS='|'
#most powerful for more complex FS
awk 'BEGIN {FS="|"; OFS="\t"} {print $1, $2}' file
#clean
awk -v FS='|' -v OFS='\t' '{print $1, $2}' file

Upvotes: 0

chan-98

Reputation: 97

I know I am late to this but you can also just use the tr command: tr "|" "\t"

Upvotes: 1

Thor

Reputation: 47099

Short answer

FS and OFS are set too late to affect the first line. Use something like this instead:

awk '{print $1,$2,$3,$4,$5}' FS='|' OFS='\t'

You can also use this shorter version:

awk -v FS='|' -v OFS='\t' '$1=$1'
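As a rough illustration of why the shorter version works: assigning to a field makes awk rebuild $0 with the fields joined by OFS, and the assignment itself serves as the pattern, so the record is printed whenever the new $1 is a true value. The input below is a made-up two-line sample, not the file from the question; the second line has an empty first field to show that such records are skipped.

```shell
# Hypothetical sample: the second record has an empty first field.
sample='x|y|z
|y|z'

# $1=$1 rebuilds $0 using OFS. Its value (the new $1) is used as the
# pattern, so a record whose first field is empty is not printed.
rebuilt=$(printf '%s\n' "$sample" | awk -v FS='|' -v OFS='\t' '$1=$1')

printf '%s\n' "$rebuilt"
```

If records with an empty (or numerically zero) first field must be kept, spelling it out as '{$1=$1; print}' avoids relying on the pattern's truth value.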

A bit longer answer

It doesn't work because awk has already performed record/field splitting at the time when FS and OFS are set. You can force a re-splitting by setting $0 to $0, e.g.:

awk '{FS="|";OFS="\t";$0=$0} {print $1,$2,$3,$4,$5}'
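The timing can be reproduced with a small sketch, assuming a POSIX-conforming awk (some older implementations defer field splitting and behave differently). The two-line input is invented for illustration.

```shell
# Hypothetical two-line, pipe-separated input.
sample='a|b
c|d'

# FS is assigned inside the main action: the first record was already
# split on whitespace, so only the second record is split on "|".
late=$(printf '%s\n' "$sample" | awk '{FS="|"} {print $1}')

# Re-assigning $0 forces the current record to be split again with the new FS.
resplit=$(printf '%s\n' "$sample" | awk '{FS="|"; $0=$0} {print $1}')

# FS is assigned in BEGIN, before any record is read.
early=$(printf '%s\n' "$sample" | awk 'BEGIN{FS="|"} {print $1}')

printf 'late:\n%s\nresplit:\n%s\nearly:\n%s\n' "$late" "$resplit" "$early"
```

On such an awk, late keeps the whole first line as $1, while resplit and early both yield the first field of every line.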

The conventional ways to do this are 1. set FS and others in the BEGIN clause, 2. set them through the -v VAR=VALUE notation, or 3. append them after the script as VAR=VALUE. My preferred style is the last alternative:

awk '{print $1,$2,$3,$4,$5}' FS='|' OFS='\t'

Note that there is a significant difference in when -v and post-script variables are set. Variables given with -v are set before the BEGIN clause runs, whilst variables given after the script are set only after the BEGIN clause has finished, just before input is read.
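A minimal way to see this difference is to print FS from inside BEGIN under both styles. This is only a sketch; stdin is redirected from /dev/null so that no input is needed.

```shell
# -v assignments happen before BEGIN runs, so BEGIN sees the new value.
with_v=$(awk -v FS='|' 'BEGIN { print "FS=[" FS "]" }' </dev/null)

# Assignments placed after the script are processed after BEGIN,
# so BEGIN still sees the default FS (a single space).
post=$(awk 'BEGIN { print "FS=[" FS "]" }' FS='|' </dev/null)

printf '%s\n%s\n' "$with_v" "$post"
```

This only matters when the BEGIN clause itself depends on the variable; for the one-liners above, both styles set FS and OFS before the first record is read.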

Upvotes: 45

Kent

Reputation: 195059

Try setting FS and OFS in a BEGIN block:

awk 'BEGIN{FS="|";OFS="\t"} {print $1,$2,$3,$4,$5}'

Upvotes: 22
