blue212121

Reputation: 23

How to find the same string on consecutive lines and then print every line in that run

I have a ping log file in this format:

2021/02/15 14:22:27 : Reply[1] from 10.10.10.1: bytes=32 time=31.9 ms TTL=244 jitter=0.00 ms
2021/02/15 14:22:27 : Reply[2] from 10.10.10.1: bytes=32 time=32.5 ms TTL=244 jitter=0.03 ms
2021/02/15 14:22:28 : 10.10.10.1: request timed out
2021/02/15 14:22:28 : Reply[4] from 10.10.10.1: bytes=32 time=29.9 ms TTL=244 jitter=0.28 ms
2021/02/15 14:22:29 : Reply[5] from 10.10.10.1: bytes=32 time=27.4 ms TTL=244 jitter=0.42 ms
2021/02/15 14:22:29 : Reply[6] from 10.10.10.1: bytes=32 time=31.3 ms TTL=244 jitter=0.63 ms
2021/02/15 14:22:30 : 10.10.10.1: request timed out
2021/02/15 14:22:31 : 10.10.10.1: request timed out
2021/02/15 14:22:31 : 10.10.10.1: request timed out
2021/02/15 14:22:32 : Reply[10] from 10.10.10.1: bytes=32 time=33.8 ms TTL=244 jitter=0.91 ms

I'm only looking for lines where it doesn't reply two or more times in a row (a single isolated timeout is OK).

I've tried using awk like this, but it prints every line I want except the very last one of each run:

awk -F " : " '($2 !~ /Reply/ && $2 == prev2) {print prevline} {prev2 = $2; prevline = $0}' file

My question is: how would you modify the awk command so that it also prints the last line of a consecutive run? Or is there a Python solution, or something with regex?

So the expected output would be:

2021/02/15 14:22:30 : 10.10.10.1: request timed out
2021/02/15 14:22:31 : 10.10.10.1: request timed out
2021/02/15 14:22:31 : 10.10.10.1: request timed out

since the timeout happens on two or more consecutive lines. The awk command I have prints only the first two of these lines and not the last. (It doesn't print the timeout from 14:22:28 because that one happens only once, which is expected.)

Upvotes: 2

Views: 433

Answers (2)

karakfa

Reputation: 67507

Something like this should work; perhaps it can be simplified further:

$ awk '/Reply/{c=0} c==1{print p} c; !/Reply/{c++; p=$0}' file

2021/02/15 14:22:30 : 10.10.10.1: request timed out
2021/02/15 14:22:31 : 10.10.10.1: request timed out
2021/02/15 14:22:31 : 10.10.10.1: request timed out

Keep a counter c of the consecutive lines of interest (those without Reply). Keep a copy of the first line of a run in p and print it only when a second such line arrives, meaning c has not been reset in between. From then on, print every further line in the block as it is read. The last statement has some redundancy, since it stores the current line in p on every non-Reply line even though only the first one of a run is needed, but that doesn't have much impact.

c; is shorthand for c!=0{print}
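
Since the question also asks about Python, here is a minimal sketch of the same counter logic. The function name timeout_runs and the variable names are my own, and, like the awk one-liner, it assumes that any line without Reply counts as a timeout.

```python
def timeout_runs(lines):
    """Yield every line that belongs to a run of 2+ consecutive non-Reply lines."""
    count = 0    # length of the current non-Reply run (c in the awk version)
    held = None  # first line of the run, held back until a second one arrives (p)
    for line in lines:
        if "Reply" in line:
            count = 0          # a reply ends the current run
            continue
        count += 1
        if count == 1:
            held = line        # not emitted yet: a lone timeout is acceptable
        else:
            if count == 2:
                yield held     # run confirmed, release the held first line
            yield line


sample = [
    "2021/02/15 14:22:28 : 10.10.10.1: request timed out",
    "2021/02/15 14:22:28 : Reply[4] from 10.10.10.1: bytes=32 time=29.9 ms TTL=244 jitter=0.28 ms",
    "2021/02/15 14:22:30 : 10.10.10.1: request timed out",
    "2021/02/15 14:22:31 : 10.10.10.1: request timed out",
    "2021/02/15 14:22:31 : 10.10.10.1: request timed out",
    "2021/02/15 14:22:32 : Reply[10] from 10.10.10.1: bytes=32 time=33.8 ms TTL=244 jitter=0.91 ms",
]
for line in timeout_runs(sample):
    print(line)
```

With the sample above this prints the three timeout lines from 14:22:30 to 14:22:31 and skips the lone timeout at 14:22:28. To run it on a real log, pass an open file object instead of the list and strip the trailing newline from each line before printing.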

Upvotes: 2

RavinderSingh13

Reputation: 133518

With your shown samples, please try the following:

awk '
!/request timed out/{
  if(count>=2){ print val }
  val=""
  count=0
}
/request timed out/{
  count++
  val=(val?val ORS:"")$0
}
END{
  if(count>=2){ print val }
}
'  Input_file

Explanation: a line-by-line explanation of the above.

awk '                           ## Start of the awk program.
!/request timed out/{           ## If the line does NOT contain request timed out, do the following.
  if(count>=2){ print val }     ## If the run that just ended had 2 or more timeouts, print the buffered lines.
  val=""                        ## Clear the buffer.
  count=0                       ## Reset the counter to 0.
}
/request timed out/{            ## If the line contains request timed out, do the following.
  count++                       ## Increase the length of the current run by 1.
  val=(val?val ORS:"")$0        ## Append the line to val, separated by ORS (a newline by default).
}
END{                            ## END block, run after the last line has been read.
  if(count>=2){ print val }     ## Print the buffered run if the file ends inside a run of 2 or more timeouts.
}
'  Input_file                   ## Name of the input file.
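
For the Python part of the question, the buffer-and-count idea above maps fairly directly onto itertools.groupby from the standard library. This is only a sketch under the same assumption (a timeout line contains the text request timed out); the names is_timeout, timeout_blocks and min_run are hypothetical.

```python
from itertools import groupby


def is_timeout(line):
    """True for lines that report a timed-out request."""
    return "request timed out" in line


def timeout_blocks(lines, min_run=2):
    """Yield the lines of every block of at least min_run consecutive timeouts."""
    # groupby splits the input into maximal runs of lines with the same key,
    # so each timeout block arrives as one group (the role of val and count).
    for timed_out, group in groupby(lines, key=is_timeout):
        block = list(group)
        if timed_out and len(block) >= min_run:
            yield from block


log = [
    "2021/02/15 14:22:29 : Reply[6] from 10.10.10.1: bytes=32 time=31.3 ms TTL=244 jitter=0.63 ms",
    "2021/02/15 14:22:30 : 10.10.10.1: request timed out",
    "2021/02/15 14:22:31 : 10.10.10.1: request timed out",
]
for line in timeout_blocks(log):
    print(line)
```

Because groupby closes the final group when the input ends, a block at the very end of the file is emitted without anything equivalent to the END block. The sample deliberately ends inside a block to show that case.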

Upvotes: 3
