Reputation:
Can someone explain what I'm doing wrong and how to do it better?
I have a file consisting of records with field separator "-" and record separator "\t" (tab). I want to put each record on its own line, followed by its record number, separated by a tab. The input file is called foo.txt.
$ cat foo.txt
a-b-c e-f-g x-y-z
$ < foo.txt tr -cd "\t" | wc -c
2
$ wc foo.txt
1 3 18 foo.txt
My awk script is in the file foo.awk:
BEGIN { RS = "\t" ; FS = "-" ; OFS = "\t" }
{
print $1 "-" $2 "-" $3, NR
}
And here is what I get when I run it:
$ gawk -f foo.awk foo.txt
a-b-c 1
e-f-g 2
x-y-z
3
The last record is directly followed by a newline, a tab, and the last number. What is going on?
Upvotes: 2
Views: 61
Reputation: 75458
awk 'BEGIN { RS = "\t"; FS = OFS = "-" } { sub(/\n/, ""); print $0 "\t" NR }' file
Output:
a-b-c 1
e-f-g 2
x-y-z 3
Setting ORS = "\n" is not necessary, since that is already the default. And with GNU Awk or Mawk, you can just set RS = "[\t\n]+":
awk 'BEGIN { RS = "[\t\n]+"; FS = OFS = "-" } { print $0 "\t" NR }' file
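As a rough check, the two variants above can be run side by side on the sample data from the question. This is only a sketch: the `sample` helper and the variable names are made up for illustration, and the second command assumes an awk that accepts a regular expression in RS (GNU Awk or Mawk, as noted above).

```shell
# Hypothetical helper that recreates the 18-byte sample input from the question.
sample() { printf 'a-b-c\te-f-g\tx-y-z\n'; }

# Variant 1: single-character RS, then strip the newline left in the last record.
out1=$(sample | awk 'BEGIN { RS = "\t"; FS = OFS = "-" } { sub(/\n/, ""); print $0 "\t" NR }')

# Variant 2: regex RS that treats runs of tabs and newlines as record separators.
# Assumes gawk or mawk; other awks may only honor the first character of RS.
out2=$(sample | awk 'BEGIN { RS = "[\t\n]+"; FS = OFS = "-" } { print $0 "\t" NR }')

printf '%s\n' "$out1"
printf '%s\n' "$out2"
```

Both commands should print the same three lines, each record followed by a tab and its number.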
Upvotes: 0
Reputation: 3239
There is a newline character at the end of your data, and it is also output when printing $3.
In particular, it looks like this:
$1 = "x"
$2 = "y"
$3 = "z\n"
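One way to see this for yourself is to print the length of $3 for every record. The following is a minimal sketch that recreates the sample input with printf instead of reading foo.txt; the variable name `lengths` is just for illustration.

```shell
# Recreate the sample input: three tab-separated records plus a final newline.
# With RS = "\t" the last record is "x-y-z\n", so its third field is "z\n".
lengths=$(printf 'a-b-c\te-f-g\tx-y-z\n' |
  awk 'BEGIN { RS = "\t"; FS = "-" } { print NR ": " length($3) }')

printf '%s\n' "$lengths"
# Prints:
# 1: 1
# 2: 1
# 3: 2
```

The third record reports a length of 2 because the newline is part of the field.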
You can remove the trailing newline with tr before passing everything to awk:
tr -d '\n' < foo.txt | awk -f foo.awk
Alternatively, add \n to the list of field separators (as shown in the answer by Kent), since awk strips the separators from the fields.
Upvotes: 1
Reputation: 195039
Well, I don't know your exact goal, but since you are already using awk, you can just add \n to FS. That removes the trailing \n without starting another process such as tr, sed, or a second awk:
BEGIN { RS = "\t" ; FS = "-|\n" ; OFS = "\t" }
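For completeness, here is a sketch of that BEGIN line combined with the print statement from the question, run on a recreated copy of the sample input. The `sample` helper and the `fixed` variable are hypothetical names used only for this example.

```shell
# Hypothetical helper that recreates the sample input from the question.
sample() { printf 'a-b-c\te-f-g\tx-y-z\n'; }

# FS = "-|\n" splits on either a dash or a newline, so the trailing newline
# ends up as a separator after $3 instead of being part of it.
fixed=$(sample | awk '
BEGIN { RS = "\t"; FS = "-|\n"; OFS = "\t" }
{ print $1 "-" $2 "-" $3, NR }
')

printf '%s\n' "$fixed"
```

Note that the last record now has an empty fourth field after the newline; that is harmless here because only $1 to $3 are printed.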
Upvotes: 1