Extra newline coming from somewhere

Can someone explain what I'm doing wrong and how to do it better?

I have a file consisting of records with field separator "-" and record separator "\t" (tab). I want to put each record on a line, followed by the line number, separated by a tab. The input file is called foo.txt.

$ cat foo.txt
a-b-c   e-f-g   x-y-z
$ < foo.txt tr -cd "\t" | wc -c
2
$ wc foo.txt
 1  3 18 foo.txt

My awk script is in the file foo.awk

BEGIN { RS = "\t" ; FS = "-" ; OFS = "\t" }
{
    print $1 "-" $2 "-" $3, NR
}

And here is what I get when I run it:

$ gawk -f foo.awk foo.txt
a-b-c   1
e-f-g   2
x-y-z
    3

The last record is directly followed by a newline, a tab, and the last number. What is going on?

Upvotes: 2

Answers (3)

konsolebox

Reputation: 75458

awk 'BEGIN { RS = "\t"; FS = OFS = "-" } { sub(/\n/, ""); print $0 "\t" NR }' file

Output:

a-b-c   1
e-f-g   2
x-y-z   3

Setting ORS = "\n" is not necessary, since that is already the default.

With GNU Awk or Mawk, which accept a regular expression as RS, you can simply set RS = "[\t\n]+":

awk 'BEGIN { RS = "[\t\n]+"; FS = OFS = "-" } { print $0 "\t" NR }' file

Upvotes: 0

martin

Reputation: 3239

There is a newline character at the end of your data, and it is output as part of $3 when the last record is printed.

In particular, it looks like this:

$1 = "x"
$2 = "y"
$3 = "z\n"
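One way to see this for yourself is to print the length of each record. This is only a minimal sketch: it recreates the sample input with printf instead of reading foo.txt, and the variable name lengths is just for illustration.

```shell
# Feed the sample data (three tab-separated records plus a trailing newline)
# to awk and report the length of every record.
lengths=$(printf 'a-b-c\te-f-g\tx-y-z\n' |
  awk 'BEGIN { RS = "\t" } { print NR ": " length($0) }')

# The third record is one character longer than the others,
# because the trailing newline is part of it.
printf '%s\n' "$lengths"
```

The first two records report 5 characters and the last one reports 6; the extra character is the newline that ends the file.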

You can remove the trailing newline with tr before passing everything to awk:

 tr -d '\n' < foo.txt | awk -f foo.awk

or alternatively add \n to the list of field separators (as shown in the answer by Kent), since awk will strip any separators from the fields.

Upvotes: 1

Kent

Reputation: 195039

I don't know your exact goal, but since you have already built this with awk, you can simply add \n to FS. That removes the trailing \n without starting another process such as tr, sed, or a second awk.

BEGIN { RS = "\t" ; FS = "-|\n" ; OFS = "\t" }
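As a quick check, here is that BEGIN line combined with the print statement from the question, run as a one-liner. This is a sketch that assumes the sample data from the question, generated with printf rather than read from foo.txt; the variable name out is only for illustration.

```shell
# With "\n" added to FS, the trailing newline is treated as a field
# separator, so it no longer ends up inside $3 of the last record.
out=$(printf 'a-b-c\te-f-g\tx-y-z\n' |
  awk 'BEGIN { RS = "\t"; FS = "-|\n"; OFS = "\t" }
       { print $1 "-" $2 "-" $3, NR }')

# Each record is now printed on its own line, followed by a tab and NR.
printf '%s\n' "$out"
```

Note that the last record now has an empty fourth field after the newline, which is harmless here because only $1 through $3 are printed.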

Upvotes: 1
