Rory Lester

Reputation: 2918

sed/awk unix csv file modification

I have a directory that is receiving .csv files.

column1,column2,column3,column4
value1,0021,value3,value4,
value1,00211,value3,value4,

I want to remove the header, zero-pad the second column to 6 digits, and insert ":" so that it is in HH:MM:SS format, e.g.:

value1,00:00:21,value3,value4,
value1,00:02:11,value3,value4,

I can pad the value to 6 digits using awk, but I am not sure how to insert a colon after every 2 characters of the second field ($2). Alternatively, can this be done entirely in sed? Which would be better for performance?

Thank you

Upvotes: 0

Views: 107

Answers (3)

RomanPerekhrest

Reputation: 92854

With awk formatting + substitution magic:

awk 'BEGIN{ FS = OFS = "," }
     NR > 1{ $2=sprintf("%06d", $2); gsub(/[0-9]{2}/, "&:", $2);
             $2=substr($2, 1, 8); print }' file

The output:

value1,00:00:21,value3,value4,
value1,00:02:11,value3,value4,
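Since the question mentions a whole directory of .csv files, here is a minimal sketch of the same idea applied to several files in one awk call. It assumes every file starts with a header line; the temporary directory and the file names a.csv and b.csv are made up for the demonstration. NR keeps counting across files, so NR > 1 would only drop the first header, while FNR restarts at 1 for each file. The interval expression {2} is spelled out as [0-9][0-9] so that it also works in awks without interval support.

```shell
# Two sample files in a temporary directory (hypothetical names).
dir=$(mktemp -d)
printf '%s\n' 'column1,column2,column3,column4' 'value1,0021,value3,value4,' > "$dir/a.csv"
printf '%s\n' 'column1,column2,column3,column4' 'value1,00211,value3,value4,' > "$dir/b.csv"

# FNR restarts at 1 for each input file, so every header line is skipped.
result=$(awk 'BEGIN { FS = OFS = "," }
FNR > 1 {
    $2 = sprintf("%06d", $2)        # zero-pad to six digits
    gsub(/[0-9][0-9]/, "&:", $2)    # 000021 -> 00:00:21:
    $2 = substr($2, 1, 8)           # keep HH:MM:SS, drop the trailing colon
    print
}' "$dir"/*.csv)

printf '%s\n' "$result"
rm -rf "$dir"
```

This writes all converted rows to one stream; writing one output file per input would need a shell loop or a redirection inside awk.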

Upvotes: 2

Wiktor Stribiżew

Reputation: 626845

You may do it all with GNU awk:

awk 'BEGIN{FS=OFS=","} NR==1{next} {$2=sprintf("%06d", $2); $2=substr($2,1,2) gensub(/.{2}/,":&","g",substr($2,3))}1' file

See an online demo

Details

  • BEGIN{FS=OFS=","} - sets the input and output field separators to a comma
  • NR==1{next} - skips the header line
  • $2=sprintf("%06d", $2) - pads Field 2 with leading zeros to six digits
  • $2=substr($2,1,2) gensub(/.{2}/,":&","g",substr($2,3)) - sets Field 2 to its first two chars (substr($2,1,2)) followed by the rest of the field, starting from the third char, with : inserted before each two-char chunk
  • 1 - an always-true condition that triggers the default print action.
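gensub is a gawk extension, so the command above will not run in other awks. As a rough sketch of the same steps using only substr, which should work in any POSIX awk, the padded value can be cut into three two-character pieces and joined with colons. The temporary file and the variable name t are illustrative only, and the sketch assumes the second field never exceeds six digits.

```shell
# Sample input from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'column1,column2,column3,column4' \
              'value1,0021,value3,value4,' \
              'value1,00211,value3,value4,' > "$tmp"

result=$(awk 'BEGIN { FS = OFS = "," }
NR > 1 {
    t = sprintf("%06d", $2)    # zero-pad to six digits
    # Join the three two-character chunks with colons.
    $2 = substr(t, 1, 2) ":" substr(t, 3, 2) ":" substr(t, 5, 2)
    print
}' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```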

Upvotes: 2

karakfa

Reputation: 67497

with sed

$ sed -nE '2,$s/,([0-9]+)/,00000\1/;s/,0+(..)(..)(..),/,\1:\2:\3,/p' file

value1,00:00:21,value3,value4,
value1,00:02:11,value3,value4,

I think it can be simplified a little bit.
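One possible simplification, offered as a sketch rather than a drop-in replacement: delete the header with 1d and drop -n and the p flag. Unlike the command above, this prints every non-header line, including any line where the second substitution does not match. It assumes a sed that accepts -E (GNU and BSD sed both do) and that the second field is purely numeric with at most six digits.

```shell
# Sample input from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'column1,column2,column3,column4' \
              'value1,0021,value3,value4,' \
              'value1,00211,value3,value4,' > "$tmp"

# 1d drops the header; the first s prepends five zeros to the first
# all-digit field; the second s keeps its last six digits as HH:MM:SS.
result=$(sed -E '1d; s/,([0-9]+),/,00000\1,/; s/,0*([0-9]{2})([0-9]{2})([0-9]{2}),/,\1:\2:\3,/' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```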

Upvotes: 1
