gbtimmon

Reputation: 4332

awk, setting FS="," does not seem to work, are there caveats I should know of?

Okay, so I am trying to write a simple awk script to clear some of the commas out of some CSV files I have.

Here are a few lines of sample data

  PRD,,,,PEWPRV100D,,,EWPRVU457D,,,,12/31/2011  10:09:14 PM,,,,,5,,,4,,
  PRD,,,,PEWPRV100D,,,EWPRVU250D,,,,12/31/2011  10:09:23 PM,,,,,67,,,69,,
  PRD,,,,PEWREF100D,,,EWREFU045D,,,,12/31/2011  10:09:40 PM,,,,,7,,,5,,
  PRD,,,,PEWPRV100D,,,EWPRVU191D,,,,12/31/2011  10:09:40 PM,,,,,6,,,5,,

As a simple first step, I want to produce this (what I ultimately want to do is more complicated, but this is the first thing I need to do and I can't even get this right :( )

   PRD,PEWPRV100D,EWPRVU457D,12/31/2011  10:09:14 PM,5,4,
   PRD,PEWPRV100D,EWPRVU250D,12/31/2011  10:09:23 PM,67,69,
   PRD,PEWREF100D,EWREFU045D,12/31/2011  10:09:40 PM,7,5,
   PRD,PEWPRV100D,EWPRVU191D,12/31/2011  10:09:40 PM,6,5,

Here is my first attempt at an awk script

  #!/bin/awk 
  BEGIN{FS=",";} 
  {print $0,$4,$7,$11,$16,$19 }
  END{print "DONE"}

which produces

  PRD,,,,PEWPRV100D,,,EWPRVU457D,,,,12/31/2011  10:09:14 PM,,,,,5,,,4,,,,,,,
  PRD,,,,PEWPRV100D,,,EWPRVU250D,,,,12/31/2011  10:09:23 PM,,,,,67,,,69,,,,,,,
  PRD,,,,PEWREF100D,,,EWREFU045D,,,,12/31/2011  10:09:40 PM,,,,,7,,,5,,,,,,,
  PRD,,,,PEWPRV100D,,,EWPRVU191D,,,,12/31/2011  10:09:40 PM,,,,,6,,,5,,,,,,,

A more telling script I tried :

  #!/bin/awk 
  BEGIN{FS=",";} 
  {printf("$$%s$$", $0) }
  END{print "DONE"} 

which produces

  $$PRD,,,,PEWPRV100D,,,EWPRVU457D,,,,12/31/2011  10:09:14 PM,,,,,5,,,4,,$$
  $$PRD,,,,PEWPRV100D,,,EWPRVU250D,,,,12/31/2011  10:09:23 PM,,,,,67,,,69,,$$
  $$PRD,,,,PEWREF100D,,,EWREFU045D,,,,12/31/2011  10:09:40 PM,,,,,7,,,5,,$$
  $$PRD,,,,PEWPRV100D,,,EWPRVU191D,,,,12/31/2011  10:09:40 PM,,,,,6,,,5,,$$

showing (I think) that FS="," is not setting the delimiter to a comma, since the whole line is seen as one column. I have also tried many different forms of that line; none seem to make a difference. The man pages of the awk implementation say that FS is the variable I should set. I have also tried the -F flag, which did not help either.

Is there something obvious I am missing here?

Upvotes: 3

Views: 133

Answers (1)

jwodder

Reputation: 57580

In awk, $0 is not the first column — it's the entire line. $1 is the first column, the second column is $2, and so forth. Thus, you presumably want to change this:

  {print $0,$4,$7,$11,$16,$19 }

to this:

  {print $1,$5,$8,$12,$17,$20 }
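To check that FS="," really is splitting the line, and to see the corrected field numbers in action, here is a minimal sketch run against the first sample row from the question. The shell variable names (row, nf, out) are only for illustration. It also sets OFS, which is an addition on my part: with the default OFS the printed fields would be joined by spaces rather than commas.

```shell
# First sample row from the question (without the leading indentation).
row='PRD,,,,PEWPRV100D,,,EWPRVU457D,,,,12/31/2011  10:09:14 PM,,,,,5,,,4,,'

# NF is the number of fields awk found; a value above 1 means FS="," worked.
nf=$(printf '%s\n' "$row" | awk 'BEGIN { FS = "," } { print NF }')
printf '%s\n' "$nf"
# prints: 22

# FS splits the input on commas; OFS joins the printed fields with commas.
out=$(printf '%s\n' "$row" | awk 'BEGIN { FS = ","; OFS = "," } { print $1, $5, $8, $12, $17, $20 }')
printf '%s\n' "$out"
# prints: PRD,PEWPRV100D,EWPRVU457D,12/31/2011  10:09:14 PM,5,4
```

Note that this does not reproduce the trailing comma shown in the desired output; if that matters, printing one more (empty) field such as $21 at the end should add it.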

Upvotes: 3
