Chris Charles

Reputation: 73

Use AWK (or SED) to get text between strings - include START string but exclude END string

I am trying to use AWK (or SED, or a combination of both) to parse log files that contain a specific string, "Info:AgentSession". I want to INCLUDE the line that contains the START string "Info:AgentSession", but exclude the line containing the END string, which would be "[2015-".

Here is a snippet of a text log file on a CentOS server:


[2015-03-30 12:23:10.999] [124] [Info:AgentSession] Handling Agent message for PieraC 
Request: ReceiveReady
Action: DoNotDisturb

[2015-03-30 12:23:11.000] [124] [Info:AgentSession] Sending agent message to PieraC 
Response: ReceiveReady
RequestId: 
Status: Ok
Message: 
IsReady: False

[2015-03-30 12:23:11.000] [49] [Info:Database] (BZ2411) (SqlTaskWorker.ProcessTasks) Attempting to run task. Thread: SqlTaskWorker-37. StartTime: 1/1/0001 12:00:00 AM. ConnectionTimeout: 15. ConnectionState: Open.

[2015-03-30 12:23:11.501] [111] [Info:Dialer] Sending Dialer message
Action: UsmCommand
Command: Transfer
IsTransfered: False

[2015-03-30 12:23:11.502] [111] [Info:AgentSession] Sending agent message to MatthewW 
ActivityState: Wrapup
IsReady: False
IsSipRegistered: True

[2015-03-30 12:23:11.502] [79] [Info:Database] (BZ2411) (SqlTask.Execute) Attempting to start. Thread: SqlTaskWorker-67. 

[2015-03-30 12:23:16.207] [124] [Info:AgentSession] Sending agent message to PieraC 
Response: NonQuery
Status: Ok
Message: Query sent successfully

[2015-03-30 12:23:16.207] [88] [Info:Database] (BZ2411) (SqlTaskWorker.ProcessTasks) Attempting to run task. Thread: SqlTaskWorker-76. 
[2015-03-30 12:23:16.207] [88] [Info:Database] (BZ2411) (SqlTask.Execute) Attempting to start. Thread: SqlTaskWorker-76. 
[2015-03-30 12:23:16.208] [88] [Info:Database] (BZ2411) (SqlNonQueryTask.ExecuteCommand) Attempting to start. Thread: SqlTaskWorker-76. 
[2015-03-30 12:23:16.268] [124] [Info:AgentSession] Handling Agent message for PieraC 
Request: CallAction
CallDisposition: 


When I run the following command:


awk '/Info:AgentSession/ {flag=1;next} /\[2015-/{flag=0} flag {print}' test.log


I get the following output:


Request: ReceiveReady
Action: DoNotDisturb

Response: ReceiveReady
RequestId:
Status: Ok
Message:
IsReady: False

ActivityState: Wrapup
IsReady: False
IsSipRegistered: True

Response: NonQuery
Status: Ok
Message: Query sent successfully

Request: CallAction
CallDisposition:


But I would like the output to INCLUDE the line with the START string "Info:AgentSession", so that it ends up looking like this (omitting all other sections of the log that do not reference the START string, and using the beginning of the DATE string "[2015-" as the END string):


[2015-03-30 12:23:10.999] [124] [Info:AgentSession] Handling Agent message for PieraC 
Request: ReceiveReady
Action: DoNotDisturb

[2015-03-30 12:23:11.000] [124] [Info:AgentSession] Sending agent message to PieraC 
Response: ReceiveReady
RequestId: 
Status: Ok
Message: 
IsReady: False

[2015-03-30 12:23:11.502] [111] [Info:AgentSession] Sending agent message to MatthewW 
ActivityState: Wrapup
IsReady: False
IsSipRegistered: True


[2015-03-30 12:23:16.207] [124] [Info:AgentSession] Sending agent message to PieraC 
Response: NonQuery
Status: Ok
Message: Query sent successfully

[2015-03-30 12:23:16.268] [124] [Info:AgentSession] Handling Agent message for PieraC 
Request: CallAction
CallDisposition: 


Is this possible to do with a simple AWK or SED command?

Upvotes: 3

Views: 391

Answers (3)

potong

Reputation: 58528

This might work for you (GNU sed):

sed -n '/Info:AgentSession/,/^$/p' file

Upvotes: 0

hek2mgl

Reputation: 158170

You can use a simple loop with sed:

sed -n '/Info:AgentSession/{:a;p;n;/^$/!ba;p}' input.file

The command searches for a line containing the pattern /Info:AgentSession/. If such a line appears, the block between the curly braces {} gets executed. In that block, we define a start label for the loop and simply call it :a. Then we print the current line p, get the next line from input n and check whether it is empty /^$/. If the line is not empty ! we step back to the start of the loop ba. Otherwise we print that empty line as the record separator and start searching for /Info:AgentSession/ again on the next line of input.

Output of other lines is suppressed using the -n command line option.

Output:

[2015-03-30 12:23:10.999] [124] [Info:AgentSession] Handling Agent message for PieraC 
Request: ReceiveReady
Action: DoNotDisturb

[2015-03-30 12:23:11.000] [124] [Info:AgentSession] Sending agent message to PieraC 
Response: ReceiveReady
RequestId: 
Status: Ok
Message: 
IsReady: False

[2015-03-30 12:23:11.502] [111] [Info:AgentSession] Sending agent message to MatthewW 
ActivityState: Wrapup
IsReady: False
IsSipRegistered: True

[2015-03-30 12:23:16.207] [124] [Info:AgentSession] Sending agent message to PieraC 
Response: NonQuery
Status: Ok
Message: Query sent successfully

[2015-03-30 12:23:16.268] [124] [Info:AgentSession] Handling Agent message for PieraC 
Request: CallAction
CallDisposition: 
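
For readability, the same loop can be written one command per line with comments. This is only a sketch: the sample file is a hypothetical, trimmed-down copy of the log from the question, and the variable names log and sed_out are made up for the example.

```shell
# Hypothetical sample, trimmed from the log in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
[2015-03-30 12:23:10.999] [124] [Info:AgentSession] Handling Agent message for PieraC
Request: ReceiveReady

[2015-03-30 12:23:11.000] [49] [Info:Database] Attempting to run task.

[2015-03-30 12:23:11.502] [111] [Info:AgentSession] Sending agent message to MatthewW
IsReady: False

EOF

# Same logic as the one-liner, one sed command per line.
sed_out=$(sed -n '
/Info:AgentSession/ {
  # start of the loop
  :a
  # print the current line
  p
  # read the next input line (nothing is auto-printed because of -n)
  n
  # not an empty line: go back to the label
  /^$/!ba
  # empty line: print it as the record separator
  p
}' "$log")

printf '%s\n' "$sed_out"
rm -f "$log"
```

The Info:Database record should not appear in what this prints, while both Info:AgentSession records should.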

An alternative would be to use awk like this:

awk -F'\n' '$1 ~ /Info:AgentSession/' RS='\n\n' ORS='\n\n' input.file

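I define the input and output record separators as a sequence of two newlines. The field separator is a single newline. If the first field of a record contains the pattern Info:AgentSession, we print the whole record. Note that a multi-character RS is not portable: POSIX leaves its behaviour unspecified, while GNU awk treats it as a regular expression.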
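
A possibly more portable variant of the same idea is awk's paragraph mode, where an empty RS makes blank lines separate the records. The sketch below is an assumption on my side, not something tested against the full log, and the sample data is a hypothetical excerpt. It also shows a limitation shared by both record-based variants: an Info:AgentSession line that directly follows another log line without a blank line in between is not the first field of its record, so that record is skipped.

```shell
# Hypothetical sample; the last AgentSession entry has no blank line before it.
log=$(mktemp)
cat > "$log" <<'EOF'
[2015-03-30 12:23:16.207] [124] [Info:AgentSession] Sending agent message to PieraC
Status: Ok

[2015-03-30 12:23:16.208] [88] [Info:Database] Attempting to start.
[2015-03-30 12:23:16.268] [124] [Info:AgentSession] Handling Agent message for PieraC
Request: CallAction
EOF

# RS=""  : paragraph mode, records are separated by blank lines
# FS="\n": each line of a record is one field, so $1 is the first line
awk_out=$(awk 'BEGIN { RS = ""; FS = "\n"; ORS = "\n\n" }
               $1 ~ /Info:AgentSession/' "$log")

printf '%s\n' "$awk_out"
rm -f "$log"
```

Only the first record is selected here, because the second record starts with the Info:Database line.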


Btw, the sed command above can also be written without the -n option:

sed '/Info:AgentSession/{:a;n;/^$/!ba;p};d' input.file

In this case we search for a line containing /Info:AgentSession/ and, when one is found, execute the block between the curly braces. We define a label :a; then n prints the current line (auto-print is active because -n is not given) and reads the next line from input. As long as the line is not empty /^$/! we step back to the start of the loop ba; otherwise we print that empty line as the record separator p. All other lines get deleted d.
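
A quick way to convince yourself that the two sed variants behave the same is to run both on a small file and compare the results. This is a sketch assuming GNU sed (the one-line form with ; after a label is not portable); the sample data and the variable names with_n and without_n are made up.

```shell
# Hypothetical sample, trimmed from the log in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
[2015-03-30 12:23:10.999] [124] [Info:AgentSession] Handling Agent message for PieraC
Request: ReceiveReady

[2015-03-30 12:23:11.000] [49] [Info:Database] Attempting to run task.

[2015-03-30 12:23:11.502] [111] [Info:AgentSession] Sending agent message to MatthewW
IsReady: False

EOF

# Variant with -n and explicit p commands.
with_n=$(sed -n '/Info:AgentSession/{:a;p;n;/^$/!ba;p}' "$log")
# Variant without -n, relying on the auto-print done by n and on d.
without_n=$(sed '/Info:AgentSession/{:a;n;/^$/!ba;p};d' "$log")

if [ "$with_n" = "$without_n" ]; then
  echo "identical output"
else
  echo "outputs differ"
fi
rm -f "$log"
```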

Upvotes: 0

John1024

Reputation: 113934

Using awk:

awk '/^[[]/{f=0} /Info:AgentSession/{f=1} f' file

How it works

awk loops through each line of input. For each line, the program decides whether to set the variable f to true (1) or false (0). If f is true, the line is printed.

  • /^[[]/{f=0}

    Anytime a line begins with [, f is set to false.

  • /Info:AgentSession/{f=1}

    If the line contains the string Info:AgentSession, then the previous command is overridden and f is set to true.

  • f

    If f is true, then awk prints the line.

    The above is shorthand for f{print $0} where, in awk, $0 means the whole line.
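
The three rules can also be laid out as a commented script. The sample below is a hypothetical excerpt in which an Info:AgentSession entry directly follows Info:Database entries with no blank line in between, to illustrate that the flag approach does not depend on blank lines.

```shell
# Hypothetical sample, trimmed from the log in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
[2015-03-30 12:23:16.207] [124] [Info:AgentSession] Sending agent message to PieraC
Status: Ok

[2015-03-30 12:23:16.207] [88] [Info:Database] Attempting to run task.
[2015-03-30 12:23:16.208] [88] [Info:Database] Attempting to start.
[2015-03-30 12:23:16.268] [124] [Info:AgentSession] Handling Agent message for PieraC
Request: CallAction
EOF

flag_out=$(awk '
  # any line starting with [ begins a new log entry: stop printing
  /^[[]/              { f = 0 }
  # ...unless that entry is an AgentSession one: start printing
  /Info:AgentSession/ { f = 1 }
  # print the line while the flag is set
  f
' "$log")

printf '%s\n' "$flag_out"
rm -f "$log"
```

The order of the first two rules matters: the reset has to run before the match so that an Info:AgentSession header line turns the flag back on in the same pass.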

Upvotes: 1
