Why does awk skip the second field in the first entry?

Question

I have a manually created log file of the format

date start   duration description
2/5  10:00p  1:45     Did this and that.
2/6  2:00a   0:20     Woke up from my slumber.
==============================================
             2:05     TOTAL time spent

There are many entries in the log. To avoid manually recomputing total time every time an entry is added, I wrote the following script:

#!/bin/bash
file=`ls | grep log`
head -n -1 $file | egrep -o  [0-9]:[0-9]{2}[^ap] \
 | awk '{ FS = ":" ; SUM += 60*$1 ; SUM += $2 } END { print SUM }'

First, the script assumes there is exactly one file with log in its name, and that's the file I'm after. Second, it takes all lines other than the line with the current total, greps the time information from the line, and feeds it to awk, which converts it to minutes.

This is where I run into problems. The final sum would always be slightly off. Through trial and error, I discovered that awk will never count the second field of the very first record, e.g. the 45 minutes in this case. It will count the hour; it won't count the minutes. It has no such problem with the other records, but it's always off by the minutes in the first record.

What could be causing this behavior? How do I debug it?

SylvainD · Accepted Answer

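You set FS in the main rule, which runs once per record. By the time it runs for the first line, that line has already been split into fields using the default separator (whitespace), so $1 is the whole "1:45" (which evaluates to 1 in arithmetic) and $2 is empty. The new FS only takes effect from the second record onward.

The right way to do it is to set FS in a BEGIN block, before any input is read: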

echo -e "1:45
0:20" | awk 'BEGIN { FS=":" } { SUM += 60*$1 + $2 } END { print SUM }'
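As for how to debug it: one option is to print NF and the fields for each record and see how awk actually split the line. The following is a minimal sketch, assuming a POSIX-style awk and using only the two sample durations from the question; the variable names (input, late, early, late_sum, early_sum) are made up for illustration. It also uses -F: on the command line, which is another way to set the separator before the first record is read.

```shell
#!/bin/sh
# Sample durations taken from the log in the question.
input=$(printf '1:45\n0:20\n')

# FS assigned in the main rule: record 1 has already been split on whitespace.
late=$(printf '%s\n' "$input" | awk '{ FS = ":" ; print NR ": NF=" NF " $2=[" $2 "]" }')

# FS set before any input is read, here via the -F option.
early=$(printf '%s\n' "$input" | awk -F: '{ print NR ": NF=" NF " $2=[" $2 "]" }')

# Same comparison for the sums, in minutes.
late_sum=$(printf '%s\n' "$input" | awk '{ FS = ":" ; SUM += 60*$1 + $2 } END { print SUM }')
early_sum=$(printf '%s\n' "$input" | awk -F: '{ SUM += 60*$1 + $2 } END { print SUM }')

printf 'FS set in main rule:\n%s\nsum=%s\n' "$late" "$late_sum"
printf 'FS set up front:\n%s\nsum=%s\n' "$early" "$early_sum"
```

With the late assignment, the first record should report NF=1 and an empty $2, and the sum comes out as 80 instead of 125: exactly the 45 minutes of the first entry are missing.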
