pdoak

Reputation: 731

Awk output removes comma

If I have the following csv file called mycsv.txt:

0123, fred, 012345, end
023, smith, 012, end

and apply this awk command:

awk '{$1=sprintf("%05d", $1);$3=sprintf("%08d", $3)}1' mycsv.txt 

I get this output:

00123 fred, 00012345 end
00023 smith, 00000012 end

Why are the first and third commas removed, and how do I make sure that they are included in the output?

Upvotes: 2

Views: 609

Answers (3)

Ed Morton

Reputation: 203219

There are two things happening:

  1. If you don't specify a field separator (e.g. FS=","), awk splits fields on runs of white space, so the first field, $1, of your first input line is "0123," (with the comma) rather than "0123", and
  2. When you perform a numeric operation on a string, awk uses only the leading numeric part of that string, ignoring everything from the first non-numeric character onward as well as any leading zeros, so "0123," becomes 123 (and 000173foo would become 173).

So $1 is 0123, and therefore:

sprintf("%05d", $1) = sprintf("%05d", "0123,") = sprintf("%05d", "123") = 00123

Assigning that result to $1 replaces "0123," with 00123, hence the vanishing comma. The same thing happens to $3.
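Both effects can be checked directly. The following is a minimal sketch using the first sample line from the question; the variable names line, first_field and as_number are only illustrative:

```shell
# Sample line taken from the question's mycsv.txt.
line='0123, fred, 012345, end'

# With the default FS, fields are split on runs of white space,
# so the comma stays attached to the first field.
first_field=$(printf '%s\n' "$line" | awk '{ print $1 }')

# Adding 0 forces a numeric conversion: only the leading numeric
# part of the string survives, without the leading zero.
as_number=$(printf '%s\n' "$line" | awk '{ print $1 + 0 }')

echo "field: [$first_field]"    # field: [0123,]
echo "number: [$as_number]"     # number: [123]
```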

This is what you really wanted:

awk '
    BEGIN { FS="[[:space:]]*,[[:space:]]*"; OFS=", " }
    { $1=sprintf("%05d", $1); $3=sprintf("%08d", $3) }
1' mycsv.txt

The above will accept input with any white space around the field-separating ,s and will ensure the output fields are all separated by exactly 1 comma followed by 1 blank. If you don't want the blanks in the output just change OFS=", " to OFS=",".
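As a quick check of that claim, here is a sketch that feeds the same command a deliberately irregular line. The input is made up for illustration, and it assumes an awk that supports POSIX character classes such as [[:space:]] (gawk, BSD awk and recent mawk versions do):

```shell
# Hypothetical input with inconsistent white space around the commas.
messy='0123 ,fred,   012345 , end'

cleaned=$(printf '%s\n' "$messy" | awk '
    BEGIN { FS="[[:space:]]*,[[:space:]]*"; OFS=", " }
    { $1=sprintf("%05d", $1); $3=sprintf("%08d", $3) }
1')

echo "$cleaned"    # 00123, fred, 00012345, end
```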

Upvotes: 1

RavinderSingh13

Reputation: 133438

Could you please try the following:

awk 'BEGIN{FS=", ";OFS=","} {$1=sprintf("%05d", $1);$3=sprintf("%08d", $3)}1' Input_file

The above prints , as the output field separator. If you want a comma followed by a space instead, set OFS=", ".

Output will be as follows.

00123,fred,00012345,end
00023,smith,00000012,end
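Note that OFS only shows up in the output because the command assigns to $1 and $3: awk rebuilds the line with OFS whenever a field is modified, and otherwise prints it as it was read. A small sketch of the difference, using the first sample line (the variable names are only illustrative):

```shell
line='0123, fred, 012345, end'

# No field is assigned, so the line is printed exactly as read.
untouched=$(printf '%s\n' "$line" | awk 'BEGIN { FS=", "; OFS="," } 1')

# Assigning a field (even to itself) makes awk rebuild $0 using OFS.
rebuilt=$(printf '%s\n' "$line" | awk 'BEGIN { FS=", "; OFS="," } { $1 = $1 } 1')

echo "$untouched"   # 0123, fred, 012345, end
echo "$rebuilt"     # 0123,fred,012345,end
```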

Explanation: a detailed, commented version of the above code.

awk '                         ##Start of the awk program.
BEGIN{                        ##BEGIN section, run once before any input is read.
  FS=", "                     ##Set the field separator to a comma followed by a space.
  OFS=","                     ##Set OFS (output field separator) to a comma.
}
{
  $1=sprintf("%05d", $1)      ##Zero-pad the 1st field to a width of 5 digits.
  $3=sprintf("%08d", $3)      ##Zero-pad the 3rd field to a width of 8 digits.
}
1                             ##The pattern 1 is always true, so the current line is printed.
' Input_file                  ##Name of the input file.

Upvotes: 1

Quasímodo

Reputation: 4004

The fields of the first line are "0123,", "fred,", "012345," and "end". You replaced the first and third with 00123 and 00012345, which have no trailing comma, and that is what awk prints.

You mean:

awk '{$1=sprintf("%05d,", $1);$3=sprintf("%08d,", $3)}1' mycsv.txt

Output:

00123, fred, 00012345, end
00023, smith, 00000012, end

Upvotes: 1
