Moonrunner

Reputation: 23

Capture all log entries between two timestamps in a JSON-format log

I have a large JSON-format log file. Each entry has StartDate and StartTime fields near the start and EndDate and EndTime fields near the end.

A sample of four entries from my input log file is below. The real log file holds several days of data.

{ "Utility":"DBUpdate", "StartDate":"2020-09-21", "StartTime":"14:41:12", "Server":"eaidev", "Userid":"sx50067", "TrueExit":"No", "WaitInterval":30, "Cluster":"1", "Source":"MANNING1", "Target":"MANNING2", "ClusterListCt":5, "ListCt":55, "RequestServer":"MANNING3", "Reply":"JOT4", "ISC(Source)":0, "EndDate":"2020-09-21", "EndTime":"14:41:21", "ExitCode":0 }
{ "Utility":"DBUpdate", "StartDate":"2020-09-22", "StartTime":"14:41:12", "Server":"eaidev", "Userid":"sx50067", "TrueExit":"No", "WaitInterval":30, "Cluster":"1", "Source":"MANNING1", "Target":"MANNING2", "ClusterListCt":5, "ListCt":55, "RequestServer":"MANNING3", "Reply":"JOT4", "ISC(Source)":0, "EndDate":"2020-09-22", "EndTime":"14:41:21", "ExitCode":0 }
{ "Utility":"DBUpdate", "StartDate":"2020-09-23", "StartTime":"14:41:12", "Server":"eaidev", "Userid":"sx50067", "TrueExit":"No", "WaitInterval":30, "Cluster":"1", "Source":"MANNING1", "Target":"MANNING2", "ClusterListCt":5, "ListCt":55, "RequestServer":"MANNING3", "Reply":"JOT4", "ISC(Source)":0, "EndDate":"2020-09-23", "EndTime":"14:41:29", "ExitCode":0 }
{ "Utility":"DBUpdate", "StartDate":"2020-09-23", "StartTime":"14:42:12", "Server":"eaidev", "Userid":"sx50067", "TrueExit":"No", "WaitInterval":30, "Cluster":"1", "Source":"MANNING1", "Target":"MANNING2", "ClusterListCt":5, "ListCt":55, "RequestServer":"MANNING3", "Reply":"JOT4", "ISC(Source)":0, "EndDate":"2020-09-23", "EndTime":"14:43:21", "ExitCode":0 }

In a separate script, I run a job and capture its start date and time and its end date and time in a temp file, like below.

2020-09-23 14:41:12
2020-09-23 14:43:21

I use variables like these in my script to capture the times.

END_DATETIME=$(date '+%Y-%m-%d %T')
DATE=$(echo "${END_DATETIME}" | cut -f1 -d' ')
TIME=$(echo "${END_DATETIME}" | cut -f2 -d' ')

Using the start and end date-times in that temp file, I want to capture all log lines between the start time and the end time and write them to a new file.

I expect my new log file to be like this:

{ "Utility":"DBUpdate", "StartDate":"2020-09-23", "StartTime":"14:41:12", "Server":"eaidev", "Userid":"sx50067", "TrueExit":"No", "WaitInterval":30, "Cluster":"1", "Source":"MANNING1", "Target":"MANNING2", "ClusterListCt":5, "ListCt":55, "RequestServer":"MANNING3", "Reply":"JOT4", "ISC(Source)":0, "EndDate":"2020-09-23", "EndTime":"14:41:29", "ExitCode":0 }
{ "Utility":"DBUpdate", "StartDate":"2020-09-23", "StartTime":"14:42:12", "Server":"eaidev", "Userid":"sx50067", "TrueExit":"No", "WaitInterval":30, "Cluster":"1", "Source":"MANNING1", "Target":"MANNING2", "ClusterListCt":5, "ListCt":55, "RequestServer":"MANNING3", "Reply":"JOT4", "ISC(Source)":0, "EndDate":"2020-09-23", "EndTime":"14:43:21", "ExitCode":0 }

I can capture logs based on the date, but when I also filter on time I get more lines than I want. Can you suggest an approach?

Upvotes: 2

Views: 267

Answers (1)

thanasisp

Reputation: 5975

Notes

Your data has an obvious pitfall: it uses two separate fields (StartDate and StartTime) instead of one datetime field, which is the standard, well-known representation across programming languages and data types. To compare dates, you have to compare combinations of these fields.

Furthermore, if you have to consider more things about these dates, like time zones or daylight saving time, this structure becomes needlessly frustrating.

Another note: you are using JSON but treating it as a text file with one record per line. JSON is not necessarily printed like this, and it can contain characters in places that break simple text parsing based on column positions or pattern matching.
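To illustrate the last note, here is a minimal sketch of how jq can normalize JSON that is spread over several lines into one record per line. The file names pretty.json and compact.json, and the trimmed-down records, are made up for this example.

```shell
#!/bin/bash
# Sketch only: pretty.json and compact.json are hypothetical file names.
workdir=$(mktemp -d)

# Two records, pretty-printed across several lines each.
cat > "$workdir/pretty.json" <<'EOF'
{
  "Utility": "DBUpdate",
  "StartDate": "2020-09-23",
  "StartTime": "14:41:12"
}
{
  "Utility": "DBUpdate",
  "StartDate": "2020-09-23",
  "StartTime": "14:42:12"
}
EOF

# "." passes each record through unchanged; -c prints it compactly on one line.
jq -c . "$workdir/pretty.json" > "$workdir/compact.json"
cat "$workdir/compact.json"
```

Because jq parses the JSON rather than the text layout, the result does not depend on how the input happened to be printed.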


Using jq

In general, to filter your JSON and keep only the records whose field value falls inside a range:

jq 'select(.StartDate > "2020-09-22" and .StartDate < "2020-09-24")' file.json

You can pass bash variables to the above like this:

#!/bin/bash
start_date="2020-09-22"
end_date="2020-09-24"
    
jq -c --arg s "$start_date" \
      --arg e "$end_date"   \
      'select(.StartDate > $s and .StartDate < $e)' file.json

I have also added -c to print one record per line, which I think is what you want. You can now add any variables and any conditions on StartDate and StartTime to get what you need.
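As one possible way to combine the date and time conditions, the sketch below reads the two lines of the temp file described in the question and compares "date time" strings built inside jq. The file names times.txt, log.json and filtered.json are assumptions, the records are trimmed to the four fields the filter needs, and the inclusive bounds (>= and <=) are my reading of the expected output in the question.

```shell
#!/bin/bash
# Sketch only: times.txt, log.json and filtered.json are hypothetical file names.
workdir=$(mktemp -d)

# Temp file as in the question: start on line 1, end on line 2.
printf '%s\n' '2020-09-23 14:41:12' '2020-09-23 14:43:21' > "$workdir/times.txt"

# Trimmed-down log records (only the fields used by the filter).
cat > "$workdir/log.json" <<'EOF'
{"StartDate":"2020-09-21","StartTime":"14:41:12","EndDate":"2020-09-21","EndTime":"14:41:21"}
{"StartDate":"2020-09-22","StartTime":"14:41:12","EndDate":"2020-09-22","EndTime":"14:41:21"}
{"StartDate":"2020-09-23","StartTime":"14:41:12","EndDate":"2020-09-23","EndTime":"14:41:29"}
{"StartDate":"2020-09-23","StartTime":"14:42:12","EndDate":"2020-09-23","EndTime":"14:43:21"}
EOF

start=$(sed -n 1p "$workdir/times.txt")
end=$(sed -n 2p "$workdir/times.txt")

# "YYYY-MM-DD HH:MM:SS" strings sort chronologically as plain strings,
# so one string comparison per bound covers both date and time.
jq -c --arg s "$start" --arg e "$end" '
  select("\(.StartDate) \(.StartTime)" >= $s
     and "\(.EndDate) \(.EndTime)"     <= $e)
' "$workdir/log.json" > "$workdir/filtered.json"
cat "$workdir/filtered.json"
```

Comparing the combined string avoids the usual trap of testing date and time separately, which lets through records whose time matches on the wrong day or rejects records on boundary days.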


Concatenate date and time

Here is also a simple way to concatenate the {Start|End}{Date|Time} fields of your JSON into easily sortable datetime fields.

jq -c '.StartDate = "\(.StartDate)_\(.StartTime)" 
      | .EndDate = "\(.EndDate)_\(.EndTime)"
      | del(.StartTime, .EndTime)' file.json

This way you do not need separate conditions for date and time. Output:

{"Utility":"DBUpdate","StartDate":"2020-09-21_14:41:12","Server":"eaidev","Userid":"sx50067","TrueExit":"No","WaitInterval":30,"Cluster":"1","Source":"MANNING1","Target":"MANNING2","ClusterListCt":5,"ListCt":55,"RequestServer":"MANNING3","Reply":"JOT4","ISC(Source)":0,"EndDate":"2020-09-21_14:41:21","ExitCode":0}
{"Utility":"DBUpdate","StartDate":"2020-09-22_14:41:12","Server":"eaidev","Userid":"sx50067","TrueExit":"No","WaitInterval":30,"Cluster":"1","Source":"MANNING1","Target":"MANNING2","ClusterListCt":5,"ListCt":55,"RequestServer":"MANNING3","Reply":"JOT4","ISC(Source)":0,"EndDate":"2020-09-22_14:41:21","ExitCode":0}
{"Utility":"DBUpdate","StartDate":"2020-09-23_14:41:12","Server":"eaidev","Userid":"sx50067","TrueExit":"No","WaitInterval":30,"Cluster":"1","Source":"MANNING1","Target":"MANNING2","ClusterListCt":5,"ListCt":55,"RequestServer":"MANNING3","Reply":"JOT4","ISC(Source)":0,"EndDate":"2020-09-23_14:41:29","ExitCode":0}
{"Utility":"DBUpdate","StartDate":"2020-09-23_14:42:12","Server":"eaidev","Userid":"sx50067","TrueExit":"No","WaitInterval":30,"Cluster":"1","Source":"MANNING1","Target":"MANNING2","ClusterListCt":5,"ListCt":55,"RequestServer":"MANNING3","Reply":"JOT4","ISC(Source)":0,"EndDate":"2020-09-23_14:43:21","ExitCode":0}
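The merged fields can then be filtered in the same jq call. The following is a rough sketch under a few assumptions: the bounds arrive as "date time" strings with a space (as in the temp file from the question), the file names are made up, and the records are trimmed to the relevant fields.

```shell
#!/bin/bash
# Sketch only: log.json and merged_filtered.json are hypothetical file names.
workdir=$(mktemp -d)

cat > "$workdir/log.json" <<'EOF'
{"StartDate":"2020-09-22","StartTime":"14:41:12","EndDate":"2020-09-22","EndTime":"14:41:21"}
{"StartDate":"2020-09-23","StartTime":"14:41:12","EndDate":"2020-09-23","EndTime":"14:41:29"}
{"StartDate":"2020-09-23","StartTime":"14:42:12","EndDate":"2020-09-23","EndTime":"14:43:21"}
EOF

start="2020-09-23 14:42:00"
end="2020-09-23 14:43:21"

# Replace the space with "_" so the bounds match the merged field format.
s=$(printf '%s' "$start" | tr ' ' '_')
e=$(printf '%s' "$end" | tr ' ' '_')

# Merge date and time first, then filter on the merged fields.
jq -c --arg s "$s" --arg e "$e" '
    .StartDate = "\(.StartDate)_\(.StartTime)"
  | .EndDate   = "\(.EndDate)_\(.EndTime)"
  | del(.StartTime, .EndTime)
  | select(.StartDate >= $s and .EndDate <= $e)
' "$workdir/log.json" > "$workdir/merged_filtered.json"
cat "$workdir/merged_filtered.json"
```

With these sample bounds only the last record survives, since the second record starts before 14:42:00. Note that the output keeps the merged fields; drop the first three lines of the jq program if the original layout must be preserved.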

Upvotes: 3
