Reputation: 4332
Okay, so I am trying to write a simple awk script to clear out some of the commas in some CSV files I have.
Here are a few lines of sample data:
PRD,,,,PEWPRV100D,,,EWPRVU457D,,,,12/31/2011 10:09:14 PM,,,,,5,,,4,,
PRD,,,,PEWPRV100D,,,EWPRVU250D,,,,12/31/2011 10:09:23 PM,,,,,67,,,69,,
PRD,,,,PEWREF100D,,,EWREFU045D,,,,12/31/2011 10:09:40 PM,,,,,7,,,5,,
PRD,,,,PEWPRV100D,,,EWPRVU191D,,,,12/31/2011 10:09:40 PM,,,,,6,,,5,,
As a simple first step, I want to produce this (what I ultimately want to do is more complicated, but this is the first thing I need to do and I can't even get this right :( ):
PRD,PEWPRV100D,EWPRVU457D,12/31/2011 10:09:14 PM,5,4,
PRD,PEWPRV100D,EWPRVU250D,12/31/2011 10:09:23 PM,67,69,
PRD,PEWREF100D,EWREFU045D,12/31/2011 10:09:40 PM,7,5,
PRD,PEWPRV100D,EWPRVU191D,12/31/2011 10:09:40 PM,6,5,
Here is my first attempt at an awk script
#!/bin/awk
BEGIN{FS=",";}
{print $0,$4,$7,$11,$16,$19 }
END{print "DONE"}
which produces
PRD,,,,PEWPRV100D,,,EWPRVU457D,,,,12/31/2011 10:09:14 PM,,,,,5,,,4,,,,,,,
PRD,,,,PEWPRV100D,,,EWPRVU250D,,,,12/31/2011 10:09:23 PM,,,,,67,,,69,,,,,,,
PRD,,,,PEWREF100D,,,EWREFU045D,,,,12/31/2011 10:09:40 PM,,,,,7,,,5,,,,,,,
PRD,,,,PEWPRV100D,,,EWPRVU191D,,,,12/31/2011 10:09:40 PM,,,,,6,,,5,,,,,,,
A more telling script I tried:
#!/bin/awk
BEGIN{FS=",";}
{printf("$$%s$$", $0) }
END{print "DONE"}
which produces
$$PRD,,,,PEWPRV100D,,,EWPRVU457D,,,,12/31/2011 10:09:14 PM,,,,,5,,,4,,$$
$$PRD,,,,PEWPRV100D,,,EWPRVU250D,,,,12/31/2011 10:09:23 PM,,,,,67,,,69,,$$
$$PRD,,,,PEWREF100D,,,EWREFU045D,,,,12/31/2011 10:09:40 PM,,,,,7,,,5,,$$
$$PRD,,,,PEWPRV100D,,,EWPRVU191D,,,,12/31/2011 10:09:40 PM,,,,,6,,,5,,$$
showing (I think) that FS="," is not setting the delimiter to a comma, since the whole line is seen as one column. I have also tried many different forms of that line; none seem to make a difference. The man pages of the awk implementation say that FS is the variable I should set. I have also tried the -F flag, which did not help either.
Is there something obvious I am missing here?
Upvotes: 3
Views: 133
Reputation: 57580
In awk, $0 is not the first column; it's the entire line. $1 is the first column, the second column is $2, and so forth. Thus, you presumably want to change this:
{print $0,$4,$7,$11,$16,$19 }
to this:
{print $1,$5,$8,$12,$17,$20 }
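Note that print joins its arguments with the output field separator OFS, which defaults to a single space, so the corrected print alone would give space-separated output. Here is a minimal sketch of the whole thing run from a shell, assuming the first sample line from the question as input; the variable names line and result are just for illustration:

```shell
# One sample line copied from the question (illustrative input only).
line='PRD,,,,PEWPRV100D,,,EWPRVU457D,,,,12/31/2011 10:09:14 PM,,,,,5,,,4,,'

# -F, splits each input line on commas.
# OFS="," makes print join the selected fields with commas instead of spaces.
result=$(printf '%s\n' "$line" |
  awk -F, 'BEGIN { OFS = "," } { print $1, $5, $8, $12, $17, $20 }')

printf '%s\n' "$result"
```

The desired output in the question also ends with a trailing comma; if that matters, adding one of the empty trailing fields (for example $21) to the print list should reproduce it.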
Upvotes: 3