Reputation: 1885
I want to write an efficient awk script that will take a file similar to the excerpt shown below and print a certain line (for instance, the line beginning with "Time (UTC):") from each matching record. I believe there's a better way than what I've done in the past to do this.
Example file (sorry I don't know how to put blank lines in the code box. They're represented by "BLANK LINE"):
Processor: Some_Proc
Capsule abortion no 32
Time (UTC): Fri Jun 15 06:25:10 2012
CapsuleId: 1704167
CapsuleName: SomeAppProc
Reason: Assertion "Reason1"
BLANK LINE
Processor: Some_Proc
Capsule abortion no 33
Time (UTC): Fri Jun 15 06:25:10 2012
CapsuleId: 1704168
CapsuleName: SomeAppProc
Reason: Assertion "Reason2"
BLANK LINE
Processor: Some_Proc
Capsule abortion no 34
Time (UTC): Fri Jun 15 06:25:10 2012
CapsuleId: 1704168
CapsuleName: SomeAppProc
Reason: Assertion "Reason1"
Previous code example:
BEGIN {
    RS=""    # Each record is a "paragraph"
    FS="\n"  # Each field is a line
}
/Reason1/ {
    # print $3 would work if it always shows up on the third line
    # but the following for loop should find it if it's on a different line
    for (i=1;i<=NF;i++) {
        if ($i ~ /^Time.*/) {
            print $i
            next
        }
    }
}
Is there a more efficient way to print the line when it doesn't always appear at the same position in the record?
Thanks
Upvotes: 2
Views: 773
Reputation: 4353
How about something like this?:
BEGIN { reset(); }
END { reset(); }
$0 == "" { reset(); }
/^Reason:/ && $3 == "\"Reason1\"" { found = 1; }
/^Time \(UTC\):/ { time = $0; }
function reset() {
    if (found) { print time; }
    found = 0;
    time = "(unknown)";
}
Then just use the default record separator (newline), so that each line is its own record. The script notes the time and reason lines as they are read, and prints the stored time at the end of each matching record, that is, at a blank line or at the end of the input.
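To try this on the sample, here is a minimal sketch that wraps the script above in a shell snippet. The input is a trimmed copy of the records from the question, with some assumptions: the separators are truly empty lines, the seconds in the timestamps are changed (hypothetical values) so the output lines can be told apart, and the last record has its Time line after the Reason line to show that the order does not matter.

```shell
# Trimmed sample records; seconds altered for illustration (hypothetical values).
sample='Capsule abortion no 32
Time (UTC): Fri Jun 15 06:25:10 2012
Reason: Assertion "Reason1"

Capsule abortion no 33
Time (UTC): Fri Jun 15 06:25:11 2012
Reason: Assertion "Reason2"

Capsule abortion no 34
Reason: Assertion "Reason1"
Time (UTC): Fri Jun 15 06:25:12 2012'

out=$(printf '%s\n' "$sample" | awk '
function reset() {
    if (found) { print time }   # flush the record that just ended
    found = 0
    time = "(unknown)"
}
BEGIN    { reset() }
$0 == "" { reset() }                               # a blank line ends a record
/^Reason:/ && $3 == "\"Reason1\"" { found = 1 }    # this record matches
/^Time \(UTC\):/ { time = $0 }                     # remember the time line, wherever it is
END      { reset() }                               # flush the last record
')

printf '%s\n' "$out"
```

With this input the snippet should print the time lines of records 32 and 34 only. A separator line that contains spaces or tabs would not compare equal to "", so the real file may need a looser test such as /^[ \t]*$/.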
Upvotes: 1
Reputation: 36272
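Your solution looks good to me; I would have taken the same approach. You could use break instead of next to stop the loop once the line is found. Note that in awk, next does not jump to the next loop iteration (that is what continue does): it stops processing the current record and moves on to the next one, so your loop already ends at the first match. The difference is that break leaves only the loop, so any rules that follow would still run for that record.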
for (i=1;i<=NF;i++) {
    if ($i ~ /^Time.*/) {
        print $i
        break
    }
}
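A quick way to check how next and break behave inside the loop is to add a second rule and see which records reach it. This is only a sketch: the mode variable and the run_variant function are hypothetical helpers for the demo, and the sample is a trimmed copy of the question's records with the seconds changed so the lines can be told apart.

```shell
# Trimmed sample records; seconds altered for illustration (hypothetical values).
sample='Capsule abortion no 32
Time (UTC): Fri Jun 15 06:25:10 2012
Reason: Assertion "Reason1"

Capsule abortion no 33
Time (UTC): Fri Jun 15 06:25:11 2012
Reason: Assertion "Reason2"

Capsule abortion no 34
Reason: Assertion "Reason1"
Time (UTC): Fri Jun 15 06:25:12 2012'

run_variant() {
    # $1 is either "next" or "break" (a hypothetical switch for this demo)
    printf '%s\n' "$sample" | awk -v mode="$1" '
    BEGIN { RS = ""; FS = "\n" }   # paragraph mode: one record per block, one field per line
    /Reason1/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^Time/) {
                print $i
                if (mode == "next") next   # drop this record: no later rule sees it
                break                      # leave only the for loop
            }
        }
    }
    { print "second rule saw record " NR }
    '
}

out_next=$(run_variant next)
out_break=$(run_variant break)
printf 'with next:\n%s\n\nwith break:\n%s\n' "$out_next" "$out_break"
```

With next, only record 2 (the one without Reason1) should reach the second rule; with break, all three records reach it. If the script has no further rules, as in the question, the two variants print the same thing.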
Upvotes: 1