Reputation: 33
I have this script running on a 1.7GB text file.
#!/bin/bash
File1=$1.tmp
File2=$1.modified
grep '^#' $1 > $File2
grep -v '#' $1 > $File1
while read line; do
    column_four=$(echo $line | cut -d " " -f4)
    final_line=$(echo $line | cut -d " " -f4-5)
    if [ "$column_four" == "0" ]; then
        beginning_line=$(echo $line | cut -d " " -f1-3)
        final_line=$(echo $line | cut -d " " -f4-5)
    else
        final_line=$(echo $line | cut -d " " -f1-2)
    fi
    linef=$(echo "$beginning_line $final_line")
    echo $linef | awk '{printf "%5.0f%12.4f%12.4f%5.0f%12.4f\n", $1, $2, $3, $4, $5}' >> $File2
done < $File1
rm -f $File1
The problem: it's very, very slow. It writes the new file with the columns rearranged at a speed of about 200KB per minute on a Core2Duo. How can I make it faster?
Thank you.
Upvotes: 3
Views: 1448
Reputation: 51643
You can do the whole thing in awk, as far as I can see, with something like
awk '/^#/ { print $0 >> "File2" ; next }
     $0 !~ /#/ { if ( $4 == 0 ) {
                     f1 = $1 ; f2 = $2 ; f3 = $3
                     printf("%5.0f%12.4f%12.4f%5.0f%12.4f\n", f1, f2, f3, $4, $5) >> "File2" }
                 else { printf("%5.0f%12.4f%12.4f%5.0f%12.4f\n", f1, f2, f3, $1, $2) >> "File2" }
     }' INPUTFILE
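For reference, here is a minimal sketch of how this could be wired into the question's calling convention (input file as $1, output written to $1.modified); the out variable and the exact column layout are assumptions for the example, not part of the answer above:

#!/bin/bash
# Hypothetical wrapper: same positional argument as the original script.
awk -v out="$1.modified" '
    /^#/    { print $0 > out ; next }            # copy comment lines through unchanged
    $4 == 0 { f1 = $1 ; f2 = $2 ; f3 = $3        # remember the leading three columns
              printf("%5.0f%12.4f%12.4f%5.0f%12.4f\n", f1, f2, f3, $4, $5) > out ; next }
            { printf("%5.0f%12.4f%12.4f%5.0f%12.4f\n", f1, f2, f3, $1, $2) > out }
' "$1"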
Upvotes: 3
Reputation: 35038
I would do away with the loop and use a single invocation of awk:
awk '
{
    if ($4 == 0) {
        f1 = $1;
        f2 = $2;
        f3 = $3;
        f4 = $4;
        f5 = $5;
    } else {
        f4 = $1;
        f5 = $2;
    }
    printf ("%5.0f%12.4f%12.4f%5.0f%12.4f\n", f1, f2, f3, f4, f5);
}' < $File1 >> $File2
That way you're not invoking awk, echo and cut multiple times per line of your input file; you're just running a single awk process.
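Put together with the rest of the original script, the whole thing might reduce to something like the sketch below (file names taken from the question, untested, and it also avoids the temporary file by piping grep straight into awk):

#!/bin/bash
File2=$1.modified
grep '^#' "$1" > "$File2"       # keep the comment header, as before
grep -v '#' "$1" | awk '
{
    if ($4 == 0) {
        f1 = $1; f2 = $2; f3 = $3;
        f4 = $4; f5 = $5;
    } else {
        f4 = $1; f5 = $2;
    }
    printf("%5.0f%12.4f%12.4f%5.0f%12.4f\n", f1, f2, f3, f4, f5);
}' >> "$File2"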
Upvotes: 3