Reputation: 11
So, I have a file called "test.log" with multiple entries like this:
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T01-TC02' Test 'tst_T01-TC02' started (tst_T01-TC02)
2022-09-30T11:38:01 PASS shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T01-TC02' End of test 'tst_T01-TC02'<br>
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T01-TC01' Test 'tst_T01-TC01' started (tst_T01-TC01)
2022-09-30T11:38:01 PASS shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T01-TC01' End of test 'tst_T01-TC01'<br>
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T02-TC01' Test 'tst_T02-TC01' started (tst_T02-TC01)
2022-09-30T11:38:01 FAIL shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T02-TC01' End of test 'tst_T02-TC01'
What I want is to create X files, each containing a single test case. I achieve this with the following command:
sed '/START_TEST_CASE/,/END_TEST_CASE/!d' "$LOG_FILE_NAME" | \
    csplit -z --suffix-format="%d.log" - '/END_TEST_CASE/1' '{*}'
The files created this way are named xx5.log, xx0.log, or similar.
What I want is to modify this script so that each created file is named after its test case (the name appears in the text, on the same line as START_TEST_CASE).
For example, the first file created, containing the first test case in the log, must be named tst_T01-TC02.log, the second tst_T01-TC01.log, the third tst_T02-TC01.log, etc.
How can I achieve this?
Upvotes: 1
Views: 171
Reputation: 58488
This might work for you (GNU sed):
sed '/START/,/END/!d' file | csplit -qz - '/END/1' '{*}' &&
sed -Esn '1F;1s/.*Start (\S+).*/\1/p' xx* |
sed 'N;s/\n/ /;s/^/mv /;s/.$/.log&/' |
sh
Once the csplit command succeeds, build and execute a script that renames each csplit output file to the test case name found inside it.
The solution has three parts:
1. Parse each csplit file and emit two-line records: the first line is the original csplit file name, the second is the new file name.
2. Join each two-line record into a single line, prepend the mv command, and append .log to the new file name.
3. Pipe the result into a shell, which executes the mv commands.
N.B. The final | sh can be left off so that the generated mv commands can be checked before they are executed.
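As a rough illustration of that dry run, the sketch below runs the pipeline without the final shell stage and only prints the generated commands. It assumes GNU sed and GNU csplit; the scratch directory, the shortened two-case sample log, and the mv_script variable are made up for the example.

```shell
#!/bin/sh
# Dry run: build the mv commands but do not execute them.
# Assumes GNU sed (for -s, F and \S) and GNU csplit (for -z and '{*}').
set -eu

workdir=$(mktemp -d)   # hypothetical scratch directory
cd "$workdir"

# Two-test-case sample in the same shape as test.log.
cat > test.log <<'EOF'
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T01-TC02' Test 'tst_T01-TC02' started (tst_T01-TC02)
2022-09-30T11:38:01 PASS shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T01-TC02' End of test 'tst_T01-TC02'
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T01-TC01' Test 'tst_T01-TC01' started (tst_T01-TC01)
2022-09-30T11:38:01 PASS shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T01-TC01' End of test 'tst_T01-TC01'
EOF

# Split into xx00, xx01, ... exactly as in the answer.
sed '/START/,/END/!d' test.log | csplit -qz - '/END/1' '{*}'

# Same two sed stages as above, minus the final "| sh".
mv_script=$(sed -Esn '1F;1s/.*Start (\S+).*/\1/p' xx* |
            sed 'N;s/\n/ /;s/^/mv /;s/.$/.log&/')

# Inspect the commands; pipe them to sh once they look right.
printf '%s\n' "$mv_script"
```

For this sample the printed lines should read mv xx00 'tst_T01-TC02.log' and mv xx01 'tst_T01-TC01.log'. The new names stay inside the single quotes taken from the log line, so the shell does not interpret any characters in them.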
Upvotes: 0
Reputation: 35106
I'm not aware of an 'easy' way to do this with csplit, but if awk is an option ...
Adding a few more lines to sample input:
$ cat test.log
ignore this line
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T01-TC02' Test 'tst_T01-TC02' started (tst_T01-TC02)
2022-09-30T11:38:01 PASS shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T01-TC02' End of test 'tst_T01-TC02'<br>
ignore this line
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T01-TC01' Test 'tst_T01-TC01' started (tst_T01-TC01)
2022-09-30T11:38:01 PASS shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T01-TC01' End of test 'tst_T01-TC01'<br>
ignore this line
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T02-TC01' Test 'tst_T02-TC01' started (tst_T02-TC01)
2022-09-30T11:38:01 FAIL shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T02-TC01' End of test 'tst_T02-TC01'
ignore this line
One awk idea (replaces all of OP's current code - sed|csplit):
awk -v sq="'" '                          # define variable "sq" as a single quote
/START_TEST_CASE/ { close(outfile)       # close previous output file to keep awk from running out of file descriptors
                    split($0,a,sq)       # split line on single quote
                    outfile=a[2] ".log"  # define new output file name
                    printme=1            # enable print flag
                  }
printme           { print $0 > outfile } # if print flag enabled (==1) then print current line to "outfile"
/END_TEST_CASE/   { printme=0 }          # disable print flag
' test.log
This generates:
$ head tst*log
==> tst_T01-TC01.log <==
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T01-TC01' Test 'tst_T01-TC01' started (tst_T01-TC01)
2022-09-30T11:38:01 PASS shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T01-TC01' End of test 'tst_T01-TC01'<br>
==> tst_T01-TC02.log <==
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T01-TC02' Test 'tst_T01-TC02' started (tst_T01-TC02)
2022-09-30T11:38:01 PASS shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T01-TC02' End of test 'tst_T01-TC02'<br>
==> tst_T02-TC01.log <==
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T02-TC01' Test 'tst_T02-TC01' started (tst_T02-TC01)
2022-09-30T11:38:01 FAIL shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T02-TC01' End of test 'tst_T02-TC01'
Upvotes: 0
Reputation: 179179
I'd use AWK for this:
awk '
  /START_TEST_CASE/ {
    match($0, /tst_[^'"'"']+/)
    test_name = substr($0, RSTART, RLENGTH)
  }
  /START_TEST_CASE/ , /END_TEST_CASE/ {
    print $0 > (test_name ".log")
  }
' "$LOG_FILE_NAME"
The weird [^'"'"'] part is actually just [^'], but we have to escape it for use within a Bash single-quoted string.
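To make the quoting less mysterious, here is a small sketch that prints what the shell actually hands over for that sequence. It also shows one possible alternative: passing the quote in through an awk variable and building the pattern as a string. The variable names (pattern, sq, line, name) are only illustrative.

```shell
#!/bin/sh
# The shell glues three adjacent pieces into one word:
#   '[^'   single-quoted text, ending at the second quote
#   "'"    a single quote protected by double quotes
#   ']'    single-quoted text again
pattern='[^'"'"']'
printf '%s\n' "$pattern"

# Alternative sketch: hand the quote to awk with -v and build the regex
# as a string, so the awk program itself contains no single quote.
line="2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T01-TC02' Test 'tst_T01-TC02' started (tst_T01-TC02)"
name=$(printf '%s\n' "$line" |
       awk -v sq="'" 'match($0, "tst_[^" sq "]+") { print substr($0, RSTART, RLENGTH) }')
printf '%s\n' "$name"
```

The first printf should show [^'] and the second tst_T01-TC02. Whether the -v form reads better than the inline escape is a matter of taste; both end up matching against the same pattern.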
Or, if you don't mind a standalone AWK script, you could use this:
/START_TEST_CASE/ {
  match($0, /tst_[^']+/)
  test_name = substr($0, RSTART, RLENGTH)
}
/START_TEST_CASE/ , /END_TEST_CASE/ {
  print $0 > (test_name ".log")
}
And then:
awk -f split.awk "$LOG_FILE_NAME"
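If the log holds a very large number of test cases, leaving every output file open may run into the open-file limit of some awk implementations. The following is a hedged variant, not part of the original script, that closes each file once its END_TEST_CASE line has been written. The scratch directory, the shortened sample log, and the variable names sq and out are assumptions for the example.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)   # hypothetical scratch directory
cd "$workdir"

# Shortened sample with lines outside any test case.
cat > test.log <<'EOF'
ignore this line
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T01-TC02' Test 'tst_T01-TC02' started (tst_T01-TC02)
2022-09-30T11:38:01 PASS shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T01-TC02' End of test 'tst_T01-TC02'
ignore this line
2022-09-30T11:37:54 START_TEST_CASE Start 'tst_T02-TC01' Test 'tst_T02-TC01' started (tst_T02-TC01)
2022-09-30T11:38:01 FAIL shared/scripts/Project/LoginWindow.py:39: Comparison 'True' and 'True' are equal
2022-09-30T11:38:16 END_TEST_CASE End 'tst_T02-TC01' End of test 'tst_T02-TC01'
EOF

awk -v sq="'" '
  /START_TEST_CASE/ {
      # Take the test name from the START line and derive the file name.
      match($0, "tst_[^" sq "]+")
      out = substr($0, RSTART, RLENGTH) ".log"
  }
  /START_TEST_CASE/, /END_TEST_CASE/ {
      print > out
  }
  /END_TEST_CASE/ {
      # Release the file descriptor once the test case is complete.
      # Caveat: a test name that shows up again later in the log would
      # reopen the file with ">" and overwrite the earlier content.
      close(out)
  }
' test.log

ls
```

Running this should leave tst_T01-TC02.log and tst_T02-TC01.log next to test.log, each holding only its three test-case lines.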
Upvotes: 0