Baguma

Reputation: 353

Loop through text file using date search using sed or awk

Hello sed and awk experts, I am trying to loop through a log file with the contents below, selecting lines by date and time.
I would like to loop between 2021/08/09 11:18:10 and 2021/08/09 10:49:32. The date format is YYYY/MM/DD.

2021/08/09 10:22:24 202108091022G5a Outbound text with ref number 2
2021/08/09 10:31:44 202108091031GhG Outbound text with ref number 3
2021/08/09 10:31:51 202108091031KZL Outbound text with ref number 4
2021/08/09 10:49:32 2021080910496ZT Outbound text with ref number 5
2021/08/09 11:02:27 2021080911025eQ Outbound text with ref number 6
2021/08/09 11:14:28 202108091114Aim Outbound text with ref number 6
2021/08/09 11:15:13 202108091115bRi Outbound text with ref number 7
2021/08/09 11:17:11 202108091117KIK Outbound text with ref number 8
2021/08/09 11:18:10 202108091118dB5 Outbound text with ref number 9
2021/08/09 11:18:17 202108091118qxN Outbound text with ref number 10
2021/08/09 11:19:28 202108091119TuI Outbound text with ref number 11 

I tried the command below, but it returns an error:

sed -n '/2021/08/09 09:00:00/,/2021/08/09 11:00:00/p'  Desktop/smslog.log
sed: -e expression #1, char 7: unknown command: `0'

Upvotes: 1

Views: 155

Answers (4)

Mahdy Mirzade

Reputation: 351

First mistake

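In sed, / is the delimiter of the address regexp, so a literal / inside the pattern ends the pattern early. To match it literally you have to escape it with a backslash (\/), just as regexp metacharacters such as * and . must be escaped (\*, \.) when you mean them literally. Alternatively, start the address with \ followed by another character (for example \#) to use that character as the delimiter instead.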

In conclusion, this code is wrong:

$ sed -n '/2021/08/09 09:00:00/,/2021/08/09 11:00:00/p' Desktop/smslog.log

Correct syntax:

$ sed -n '\#^2021/08/09 10:#,\#^2021/08/09 11:#p' Desktop/smslog.log
  • This uses \# as the delimiter, as Ed suggested, instead of the escaped slashes from my original post, but the idea is the same.
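To make the delimiter point concrete, here is a minimal sketch comparing the two spellings. It assumes a throwaway three-line sample file created with mktemp (the file and its contents are hypothetical, not the asker's log), and both timestamps are deliberately ones that exist in that file.

```shell
#!/bin/sh
# Hypothetical sample data, written to a temporary file for the demo.
log=$(mktemp)
cat > "$log" <<'EOF'
2021/08/09 10:22:24 first
2021/08/09 10:49:32 second
2021/08/09 11:02:27 third
EOF

# Form 1: keep / as the delimiter and escape every literal slash.
escaped=$(sed -n '/^2021\/08\/09 10:49:32/,/^2021\/08\/09 11:02:27/p' "$log")

# Form 2: switch the delimiter to # with a leading backslash, so the
# slashes in the date need no escaping.
custom=$(sed -n '\#^2021/08/09 10:49:32#,\#^2021/08/09 11:02:27#p' "$log")

# Both forms select the same lines ("second" and "third").
printf '%s\n' "$escaped"

rm -f "$log"
```

The two commands are equivalent; the second is simply easier to read when the pattern itself contains slashes.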

Second mistake

A sed range such as /BEGIN/,/END/ only starts at a line that matches BEGIN and only stops at a line that matches END. Your log has no entry at exactly 2021/08/09 09:00:00, so the range never starts and nothing is printed. Likewise, if only the END timestamp were missing, the range would run on to the end of the file.

For example, you can match on the hour prefix instead of a full timestamp, such as every 11:--:-- message:

$ sed -n '\#^2021/08/09 11:#,\#^2021/08/09 11:#p' Desktop/smslog.log
2021/08/09 11:02:27 2021080911025eQ Outbound text with ref number 6
2021/08/09 11:14:28 202108091114Aim Outbound text with ref number 6
2021/08/09 11:15:13 202108091115bRi Outbound text with ref number 7
2021/08/09 11:17:11 202108091117KIK Outbound text with ref number 8
2021/08/09 11:18:10 202108091118dB5 Outbound text with ref number 9
2021/08/09 11:18:17 202108091118qxN Outbound text with ref number 10
2021/08/09 11:19:28 202108091119TuI Outbound text with ref number 11

$ sed -n '\#^2021/08/09 10:#,\#^2021/08/09 10:#p' Desktop/smslog.log
2021/08/09 10:22:24 202108091022G5a Outbound text with ref number 2
2021/08/09 10:31:44 202108091031GhG Outbound text with ref number 3
2021/08/09 10:31:51 202108091031KZL Outbound text with ref number 4
2021/08/09 10:49:32 2021080910496ZT Outbound text with ref number 5

$ sed -n '\#^2021/08/09 10:#,\#^2021/08/09 11:#p' Desktop/smslog.log
2021/08/09 10:22:24 202108091022G5a Outbound text with ref number 2
2021/08/09 10:31:44 202108091031GhG Outbound text with ref number 3
2021/08/09 10:31:51 202108091031KZL Outbound text with ref number 4
2021/08/09 10:49:32 2021080910496ZT Outbound text with ref number 5
2021/08/09 11:02:27 2021080911025eQ Outbound text with ref number 6

Note that a range closes at the first line matching the end pattern, so this last command stops at the first 11:--:-- entry rather than printing the whole 11 o'clock hour.
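The range behaviour described in this answer can be checked with a small sketch. It assumes a hypothetical four-line sample file (again created with mktemp, not the asker's log) and shows three cases: a start timestamp that is absent, an end timestamp that is absent, and an end pattern that matches more than one line.

```shell
#!/bin/sh
# Hypothetical sample data for the demo.
log=$(mktemp)
cat > "$log" <<'EOF'
2021/08/09 10:22:24 first
2021/08/09 10:49:32 second
2021/08/09 11:02:27 third
2021/08/09 11:14:28 fourth
EOF

# Case 1: no line matches the start (09:00:00), so the range never opens
# and nothing is printed.
missing_start=$(sed -n '\#^2021/08/09 09:00:00#,\#^2021/08/09 11:02:27#p' "$log")

# Case 2: no line matches the end (11:00:00), so the range stays open
# until the last line of the file.
missing_end=$(sed -n '\#^2021/08/09 10:49:32#,\#^2021/08/09 11:00:00#p' "$log")

# Case 3: the end pattern matches two lines, but the range closes at the
# first of them, so "fourth" is not printed.
first_match=$(sed -n '\#^2021/08/09 10:#,\#^2021/08/09 11:#p' "$log")

printf 'missing start: [%s]\n' "$missing_start"
printf 'missing end:\n%s\n' "$missing_end"
printf 'first match:\n%s\n' "$first_match"

rm -f "$log"
```

In other words, a sed range is driven by which lines happen to match, not by where a timestamp would fall in sorted order, which is why it is fragile for arbitrary time windows.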

Upvotes: 4

Ed Morton

Reputation: 203665

Using any awk in any shell on every Unix box:

$ awk -v beg='2021/08/09 09:00:00' -v end='2021/08/09 11:00:00' '
    {cur=$1" "$2} (beg<=cur) && (cur<=end)
' file
2021/08/09 10:22:24 202108091022G5a Outbound text with ref number 2
2021/08/09 10:31:44 202108091031GhG Outbound text with ref number 3
2021/08/09 10:31:51 202108091031KZL Outbound text with ref number 4
2021/08/09 10:49:32 2021080910496ZT Outbound text with ref number 5

or more efficiently but a bit less clearly:

$ awk -v beg='2021/08/09 09:00:00' -v end='2021/08/09 11:00:00' '
    {cur=$1" "$2} cur>end{exit} beg<=cur
' file

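You don't need GNU awk's time functions for this; all you need is a simple string comparison of the timestamps. That works because the YYYY/MM/DD HH:MM:SS format is fixed-width, zero-padded and ordered from most to least significant field, so string order is the same as chronological order.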

The reason not to use sed with a range expression for this is that the specific timestamps used in the sed command would have to be present in the input file (i.e. it can only support an == comparison), whereas with awk you can use >= for the start and <= for the end to select timestamps within the range without those specific timestamps having to be present.
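As a rough sketch of that point, the string comparison can be wrapped in a small shell function and run with bounds that do not occur in the data. The function name `between` and the five-line sample file are hypothetical, and the early exit assumes the log is sorted by time, as the asker's is.

```shell
#!/bin/sh
# between BEG END FILE  (hypothetical helper name)
# Prints the lines of FILE whose first two fields, read as a
# "YYYY/MM/DD HH:MM:SS" string, fall within [BEG, END].
between() {
    awk -v beg="$1" -v end="$2" '
        { cur = $1 " " $2 }   # rebuild the timestamp from fields 1 and 2
        cur > end { exit }    # sorted input: nothing later can match
        beg <= cur            # default action prints the line
    ' "$3"
}

# Hypothetical sample data; neither bound used below appears in it.
log=$(mktemp)
cat > "$log" <<'EOF'
2021/08/09 10:31:51 a
2021/08/09 10:49:32 b
2021/08/09 11:02:27 c
2021/08/09 11:14:28 d
2021/08/09 11:17:11 e
EOF

result=$(between '2021/08/09 10:40:00' '2021/08/09 11:15:00' "$log")
printf '%s\n' "$result"

rm -f "$log"
```

With these bounds the lines tagged b, c and d are selected, even though no log entry carries either boundary timestamp.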

Upvotes: 4

Tyl

Reputation: 5252

awk gives you more flexibility here (this needs GNU awk for mktime):

$ awk -F'[/:[:blank:]]' 'BEGIN{start=mktime("2021 08 09 09 00 00");end=mktime("2021 08 09 11 00 00")} {t=mktime($1 " " $2 " " $3 " " $4 " " $5 " " $6)} t>start && t<end' file.txt
2021/08/09 10:22:24 202108091022G5a Outbound text with ref number 2
2021/08/09 10:31:44 202108091031GhG Outbound text with ref number 3
2021/08/09 10:31:51 202108091031KZL Outbound text with ref number 4
2021/08/09 10:49:32 2021080910496ZT Outbound text with ref number 5
$ awk -F'[/:[:blank:]]' 'BEGIN{start=mktime("2021 08 09 10 40 00");end=mktime("2021 08 09 11 20 00")} {t=mktime($1 " " $2 " " $3 " " $4 " " $5 " " $6)} t>start && t<end' file.txt
2021/08/09 10:49:32 2021080910496ZT Outbound text with ref number 5
2021/08/09 11:02:27 2021080911025eQ Outbound text with ref number 6
2021/08/09 11:14:28 202108091114Aim Outbound text with ref number 6
2021/08/09 11:15:13 202108091115bRi Outbound text with ref number 7
2021/08/09 11:17:11 202108091117KIK Outbound text with ref number 8
2021/08/09 11:18:10 202108091118dB5 Outbound text with ref number 9
2021/08/09 11:18:17 202108091118qxN Outbound text with ref number 10
2021/08/09 11:19:28 202108091119TuI Outbound text with ref number 11

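By using mktime to convert and compare the timestamps, you can specify whatever range you want. Note that t>start && t<end excludes lines that fall exactly on either boundary.

The same commands, split across lines for readability: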

awk -F'[/:[:blank:]]' '
    BEGIN{start=mktime("2021 08 09 09 00 00");
    end=mktime("2021 08 09 11 00 00")}
    {t=mktime($1 " " $2 " " $3 " " $4 " " $5 " " $6)}
    t>start && t<end' file.txt

awk -F'[/:[:blank:]]' '
    BEGIN{start=mktime("2021 08 09 10 40 00");
    end=mktime("2021 08 09 11 20 00")}
    {t=mktime($1 " " $2 " " $3 " " $4 " " $5 " " $6)}
    t>start && t<end' file.txt

Upvotes: 1

Renaud Pacalet

Reputation: 29177

sed is probably not the best tool for this. If you have GNU awk you could give it a try: the mktime extension converts date/time strings to timestamps that can then be compared numerically:

$ cat foo.awk
function d2t(d) { gsub(/(\/|:)/," ",d); return mktime(d,1); }
BEGIN { a=d2t(x); b=d2t(y); }
{ d=d2t($1 " " $2); }
d>=a && d<=b

The d2t function converts your date/time strings to UNIX timestamps, treating them as UTC to avoid DST shifts. It first replaces the / and : characters with spaces using gsub, then passes the result to mktime.

Pass your start and stop date/times as the x and y variables. Demo:

$ awk -f foo.awk -v x="2021/08/09 09:00:00" -v y="2021/08/09 11:00:00" data.txt
2021/08/09 10:22:24 202108091022G5a Outbound text with ref number 2
2021/08/09 10:31:44 202108091031GhG Outbound text with ref number 3
2021/08/09 10:31:51 202108091031KZL Outbound text with ref number 4
2021/08/09 10:49:32 2021080910496ZT Outbound text with ref number 5

Upvotes: 3
