Reputation: 121
I have this bash script that removes columns from each line of a given CSV file, but it runs very slowly. I need to run it on files larger than 1 GB, so I'm looking for a faster solution.
#!/bin/bash
while read line; do
columns=`echo $line | awk '{print NF}' FS=,`
if [ "$columns" == "9" ]; then
echo `echo $line | cut -d \, -f 1,5,6,8,9`
elif [ "$columns" == "24" ]; then
echo `echo $line | cut -d \, -f 1,5,6,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24`
elif [ "$columns" == "8" ]; then
echo `echo $line | cut -d \, -f 1,4,5,6,7,8`
else
echo $line
fi
done <$1
If anyone has advice on how to speed this up, or if there's a better way to do it, that'd be awesome. Thanks a lot!
Upvotes: 1
Views: 102
Reputation: 784958
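Your entire script can be handled by a single awk command. The loop is slow because it starts several new processes (awk, cut, and the command substitutions around them) for every input line, while a single awk process reads the whole file in one pass.
Try this: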
awk 'BEGIN{FS=OFS=","}
NF==9 {print $1, $5, $6, $8, $9; next}
NF==8 {print $1, $4, $5, $6, $7, $8; next}
NF==24{print $1,$5,$6,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24; next}
1' "$1"
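For a quick sanity check before running this on a 1 GB file, the same rules can be wrapped in a shell function and fed a few sample lines. This is only a sketch: the function name trim_columns and the sample data are made up for illustration, and the 24-field rule is written as a loop (fields 1, 5, 6, then 8 through 24), which should match the field list in the question's cut command.

```shell
#!/bin/bash
# trim_columns is a hypothetical helper name. It reads CSV from the files
# given as arguments (or from stdin if none are given) and writes to stdout.
trim_columns() {
  awk 'BEGIN { FS = OFS = "," }
  NF == 9  { print $1, $5, $6, $8, $9; next }
  NF == 8  { print $1, $4, $5, $6, $7, $8; next }
  NF == 24 {
    # keep fields 1, 5, 6 and then 8 through 24
    out = $1 OFS $5 OFS $6
    for (i = 8; i <= NF; i++) out = out OFS $i
    print out
    next
  }
  { print }   # any other field count: pass the line through unchanged
  ' "$@"
}

# Made-up sample input: a 9-field line, an 8-field line, and a 3-field line.
printf '%s\n' 'a,b,c,d,e,f,g,h,i' '1,2,3,4,5,6,7,8' 'x,y,z' | trim_columns
```

With that sample, the 9-field line should come out as a,e,f,h,i, the 8-field line as 1,4,5,6,7,8, and the 3-field line unchanged. On a real file you would call it as trim_columns input.csv > output.csv. Note that splitting on a plain comma assumes no field contains a quoted comma.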
Upvotes: 1