user1613270

Reputation: 533

Bash script to find lines belonging to each other

I have a log file containing output like this:

  [mvn] Running com.mypackage.MyTest
   ...
  [mvn] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 104.648 sec
  [mvn] Running com.mypackage.MyNotExecutedTest
   ...
  [mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.525 sec
  [mvn] Running com.mypackage.AnotherNotExecutedTest
   ...
  [mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.569 sec
  [mvn] Running com.mypackage.FailedTest
   ...
  [mvn] Tests run: 5, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 22.357 sec
   ...

There can be an arbitrary number of lines where "..." appears (e.g. a stack trace or some debug output). What I want is a list of the tests that were not executed:

  com.mypackage.MyNotExecutedTest
  com.mypackage.AnotherNotExecutedTest

My approach was to grep for the pattern "Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed", but then I need some smart way to find out which test each matching line belongs to. Any good/elegant solutions here? Thanks!

Upvotes: 0

Views: 142

Answers (3)

nullrevolution

Reputation: 4137

I would do this with several grep commands and an awk, all piped together. I'll walk you through my logic:

1) Use pcregrep instead of grep to match a multi-line pattern beginning with "Running" and ending with "Tests run: 0" as follows:

command:

pcregrep -M "Running(\n|.)*?Tests run: 0" file.txt

(note the use of the -M argument to allow multi-line matches and the ? after the asterisk to make it non-greedy)

output:

[mvn] Running com.mypackage.MyTest
...
[mvn] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 104.648 sec
[mvn] Running com.mypackage.MyNotExecutedTest
...
[mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.525 sec
[mvn] Running com.mypackage.AnotherNotExecutedTest
...
[mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.569 sec

2) As you can see, this unfortunately also matched some unwanted items, so I'd use pcregrep again to remove the offending entries as follows:

command:

pcregrep -M "Running(\n|.)*?Tests run: 0" file.txt | \
pcregrep -Mv "Running(\n|.)*?Tests run: [^0]"

(note the use of the -v argument and the [^0] character class in the second pcregrep command to eliminate only processes which ran a non-zero number of tests)

output:

[mvn] Running com.mypackage.MyNotExecutedTest
...
[mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.525 sec
[mvn] Running com.mypackage.AnotherNotExecutedTest
...
[mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.569 sec

3) Then I'd grep out just the lines containing "Running":

command:

pcregrep -M "Running(\n|.)*?Tests run: 0" file.txt | \
pcregrep -Mv "Running(\n|.)*?Tests run: [^0]" | \
grep -i running

output:

[mvn] Running com.mypackage.MyNotExecutedTest
[mvn] Running com.mypackage.AnotherNotExecutedTest

4) And finally, use awk to print only the field I'm interested in seeing (the test name, which by your example seems to always be the third "word" in the row):

final command:

pcregrep -M "Running(\n|.)*?Tests run: 0" file.txt | \
pcregrep -Mv "Running(\n|.)*?Tests run: [^0]" | \
grep -i running | \
awk '{print $3};'

final output:

com.mypackage.MyNotExecutedTest
com.mypackage.AnotherNotExecutedTest

hth!

Upvotes: 1

Ansgar Wiechers

Reputation: 200443

I'd probably do it with a combination of grep and awk (this assumes the "Running" line sits directly before the matching "Tests run: 0" line):

grep -B1 "Tests run: 0" logfile | awk '/Running/ {print $NF}'
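
The one-liner above only looks at a line adjacent to the summary line, so it can miss the test name when other output sits in between. Here is a rough sketch of a variant that tolerates that, under a couple of assumptions: the function name unexecuted_via_grep is made up for illustration, the intervening output never itself contains "[mvn] Running " or "[mvn] Tests run: ", and your grep supports -B (GNU and BSD grep do; it is not in POSIX).

```shell
#!/bin/sh
# unexecuted_via_grep is a hypothetical helper name, not from the answer.
# It reads the log from the files given as arguments, or from stdin.
unexecuted_via_grep() {
  # 1) keep only the "Running" and "Tests run" lines, dropping the noise between them
  # 2) for each zero-test summary, keep the single line before it as well (-B1)
  # 3) print the last field of the "Running" lines that survived
  grep -E '\[mvn\] (Running |Tests run: )' "$@" |
    grep -B1 'Tests run: 0' |
    awk '/Running / { print $NF }'
}

# Demo on the sample from the question ("..." stands for arbitrary output).
unexecuted_via_grep <<'EOF'
  [mvn] Running com.mypackage.MyTest
   ...
  [mvn] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 104.648 sec
  [mvn] Running com.mypackage.MyNotExecutedTest
   ...
  [mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.525 sec
  [mvn] Running com.mypackage.AnotherNotExecutedTest
   ...
  [mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.569 sec
  [mvn] Running com.mypackage.FailedTest
   ...
  [mvn] Tests run: 5, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 22.357 sec
EOF
```

After the first grep, every summary line is directly preceded by its own "Running" line, which is what makes -B1 sufficient. When -B1 emits "--" group separators, the awk filter simply ignores them.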

Upvotes: 2

tripleee

Reputation: 189739

Write an awk script which stores the latest Running line, then prints the stored line if it sees Tests run: 0.

awk '/\[mvn\] Running /{ t=$3 }
  /\[mvn\] Tests run: 0/ { print t }'  logfile

Edit: I took out the line beginning anchors, in order to correctly cope with indented input.
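
To try this out without a real build log, here is a small self-contained sketch. The wrapper name list_unexecuted is hypothetical, and clearing the stored name after printing is my own addition (so a stray summary line does not repeat a stale name); the rest follows the awk script above.

```shell
#!/bin/sh
# list_unexecuted is a hypothetical wrapper name, not part of the answer.
# It reads the log from the files given as arguments, or from stdin.
list_unexecuted() {
  awk '
    # remember the most recent test name (third whitespace-separated field)
    /\[mvn\] Running /     { name = $3 }
    # a zero-test summary: report the remembered name, then forget it
    /\[mvn\] Tests run: 0/ { if (name != "") print name; name = "" }
  ' "$@"
}

# Demo on the (indented) sample from the question.
list_unexecuted <<'EOF'
  [mvn] Running com.mypackage.MyTest
   ...
  [mvn] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 104.648 sec
  [mvn] Running com.mypackage.MyNotExecutedTest
   ...
  [mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.525 sec
  [mvn] Running com.mypackage.AnotherNotExecutedTest
   ...
  [mvn] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.569 sec
  [mvn] Running com.mypackage.FailedTest
   ...
  [mvn] Tests run: 5, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 22.357 sec
EOF
```

On a real file you would call it as list_unexecuted logfile. Because awk splits fields on runs of whitespace by default, the leading indentation does not shift $3.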

Upvotes: 4
