Lastwish
Lastwish

Reputation: 327

Bash - Sort File Based on 2 Columns : HH:MM:SS AM/PM

Trying to sort lines in a file. The column1 is HH:MM:SS format and col2 is AM/PM. Need to arrange lines from AM to PM firstly and then progressive time.

Current :

11:36:48 AM col3 ...
11:32:00 AM col3 ...
03:18:54 PM col3 ...
02:26:40 PM col3 ...
01:51:56 PM col3 ...
12:55:58 PM col3 ...
11:58:48 AM col3 ...
09:38:41 AM col3 ...

Final:

09:38:41 AM col3 ...
11:32:00 AM col3 ...
11:36:48 AM col3 ...
11:58:48 AM col3 ...
12:55:58 PM col3 ...
01:51:56 PM col3 ...
02:26:40 PM col3 ...
03:18:54 PM col3 ...

Thanks

Upvotes: 0

Views: 239

Answers (4)

Fravadona
Fravadona

Reputation: 17208

You can generate a 24H equivalent of your AM/PM times and prepend it to the line with awk. Then sort will work fine and all that's left to do is to strip the added data (with sed or cut or whatever):

awk -F '[ :]' '
    {
        H24 = ($1 == 12 ? ($4 == "AM" ? 0 : 12) : ($4 == "PM" ? $1 + 12 : $1))
        print sprintf("%02d%02d%02d", H24, $2, $3), $0
    }
' data.txt |
sort |
sed 's/[^ ]* //'

Upvotes: 3

glenn jackman
glenn jackman

Reputation: 247042

A Schwartzian transform:

while read -ra fields; do
    printf '%s\t%s\n' "$(date -d "${fields[*]:0:2}" '+%s')" "${fields[*]}"
done < input \
| sort -n \
| cut -f 2-

outputs

09:38:41 AM col3 ...
11:32:00 AM col3 ...
11:36:48 AM col3 ...
11:58:48 AM col3 ...
12:55:58 PM col3 ...
01:51:56 PM col3 ...
02:26:40 PM col3 ...
03:18:54 PM col3 ...

The while loop calculates the epoch time for each timestamp and adds it to the output stream. Then it gets sorted by the numeric time. Then that field is removed.

Doing this in bash may be slower than doing it in, say, perl:

perl -MTime::Piece -lane '
        push @lines, [@F];
    } END {
        print join "\n",
              map {"@{$_->[1]}"}
              sort {$a->[0] <=> $b->[0]}
              map {[Time::Piece->strptime("$_->[0] $_->[1]", "%T %p")->epoch, $_]}
              @lines;
' input

Upvotes: 3

Maximilian Ballard
Maximilian Ballard

Reputation: 996

All you have to do is get the AM lines with grep, then use sort on the output. After doing that then you can do the same for the PM lines.

cp "my_file" "backup_file" #Backup old file before overwriting
grep '^[0-9:]* AM' "backup_file" | sort > "my_file"
grep '^[0-9:]* PM' "backup_file" | sort >> "my_file"

Its important to backup your file before using > operation since that will completely overwrite your file.

Upvotes: 2

markp-fuso
markp-fuso

Reputation: 35076

As mentioned in a comment, sorting on the current time column will be problematic in that the 12:xx entries need to come before 01:xx entries.

One approach:

  • convert the 12:xx entries to 00:xx
  • run a normal 2-column sort
  • convert the 00:xx entries back to 12:xx

One sed | sort | sed idea:

$ sed 's/^12/00/' sample.log | sort -k2,2 -k1,1 | sed 's/^00/12/'
09:38:41 AM col3 ...
11:32:00 AM col3 ...
11:36:48 AM col3 ...
11:58:48 AM col3 ...
12:55:58 PM col3 ...
01:51:56 PM col3 ...
02:26:40 PM col3 ...
03:18:54 PM col3 ...

Upvotes: 1

Related Questions