How to process large CSV files efficiently in a shell script, with better performance than the following script?

Question

I have a large CSV file, input_file.dat, with 5 columns. I want to do two things to the second column:

(1) Remove the last character (2) Wrap the value in single quotes

The following are sample rows from input_file.dat:

420374,2014-04-06T18:44:58.314Z,214537888,12462,1
420374,2014-04-06T18:44:58.325Z,214537850,10471,1
281626,2014-04-06T09:40:13.032Z,214535653,1883,1

The expected output looks like this:

420374,'2014-04-06T18:44:58.314',214537888,12462,1
420374,'2014-04-06T18:44:58.325',214537850,10471,1
281626,'2014-04-06T09:40:13.032',214535653,1883,1

I have written the following script to do this:

#!/bin/sh
inputfilename=input_file.dat
outputfilename=output_file.dat
count=1

while read line
do
  echo $count
  count=$((count + 1))
  v1=$(echo $line | cut -d ',' -f1)
  v2=$(echo $line | cut -d ',' -f2)
  v3=$(echo $line | cut -d ',' -f3)
  v4=$(echo $line | cut -d ',' -f4)
  v5=$(echo $line | cut -d ',' -f5)
  v2len=${#v2}
  v2len=$((v2len -1))
  newv2=${v2:0:$v2len}
  newv2="'$newv2'"
  row=$v1,$newv2,$v3,$v4,$v5
  echo $row >> $outputfilename
done < $inputfilename

But it takes a lot of time.

Is there a more efficient way to achieve this?

henfiber · Accepted Answer

You can do this with awk:

awk -v q="'" 'BEGIN{FS=OFS=","} {$2=q substr($2,1,length($2)-1) q}1' input_file.dat

How it works:

BEGIN{FS=OFS=","} : sets the input and output field separators (FS, OFS) to a comma.
-v q="'" : assigns a literal single quote to the variable q (to avoid complex escaping inside the awk program).
{$2=q substr($2,1,length($2)-1) q} : replaces the second field ($2) with a single quote (q), followed by the value of the 2nd field without its last character (substr(string, start, length)), followed by a closing single quote (q).
1 : a pattern that is always true; since no action is given for it, awk runs the default action, which prints the current (edited) line.
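
To try the command end to end, here is a minimal sketch that recreates the three sample rows from the question and writes the result to output_file.dat. The scratch directory created with mktemp is an assumption added for illustration so that no existing file is overwritten; it is not part of the original answer. Much of the speedup likely comes from awk handling the whole file in a single process, whereas the loop in the question starts several echo/cut processes for every input line.

```shell
#!/bin/sh
# Work in a scratch directory (illustrative assumption) so that
# no existing input_file.dat or output_file.dat is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the three sample rows from the question.
cat > input_file.dat <<'EOF'
420374,2014-04-06T18:44:58.314Z,214537888,12462,1
420374,2014-04-06T18:44:58.325Z,214537850,10471,1
281626,2014-04-06T09:40:13.032Z,214535653,1883,1
EOF

# A single awk process reads every row; the output is redirected once
# instead of being appended to the file line by line.
awk -v q="'" 'BEGIN{FS=OFS=","} {$2=q substr($2,1,length($2)-1) q}1' input_file.dat > output_file.dat

# Show the result.
cat output_file.dat
```

On the sample rows this should produce the expected output shown in the question. For a real file, only the awk line is needed, with the input and output file names adjusted.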
