Rusty Lemur

Reputation: 1885

Awk efficiently print a matching line from a matching paragraph

I want to write an efficient awk script that will take a file similar to the excerpt shown below and print a certain line (for instance, the line beginning with "Time (UTC):") from each matching record. I believe there's a better way than what I've done in the past to do this.

Example file (sorry I don't know how to put blank lines in the code box. They're represented by "BLANK LINE"):

Processor: Some_Proc
Capsule abortion no 32
Time (UTC): Fri Jun 15 06:25:10 2012
CapsuleId: 1704167
CapsuleName: SomeAppProc
Reason: Assertion "Reason1"  
BLANK LINE
Processor: Some_Proc
Capsule abortion no 33
Time (UTC): Fri Jun 15 06:25:10 2012
CapsuleId: 1704168
CapsuleName: SomeAppProc
Reason: Assertion "Reason2"  
BLANK LINE
Processor: Some_Proc
Capsule abortion no 34
Time (UTC): Fri Jun 15 06:25:10 2012
CapsuleId: 1704168
CapsuleName: SomeAppProc
Reason: Assertion "Reason1"

Previous code example (sorry, I don't know how to preserve indentation in this forum; I tried 8 spaces, but that didn't work):

BEGIN {
    RS=""  #Each record is a "paragraph"
    FS="\n" #Each field is a line
}

/Reason1/ {
    # print $3  would work if it always shows up on the third line
    # but the following for loop should find it if it's on a different line
    for (i=1;i<=NF;i++) {
        if ($i ~ /^Time.*/) {
            print $i
            next
        }
    }
} 

Is there a more efficient way to print the line if it doesn't always occur in the same position within the record?

Thanks

Upvotes: 2

Views: 773

Answers (2)

danfuzz

Reputation: 4353

How about something like this?

BEGIN { reset(); }
END { reset(); }
$0 == "" { reset(); }
/^Reason:/ && $3 == "\"Reason1\"" { found = 1; }
/^Time \(UTC\):/ { time = $0; }

function reset() {
  if (found) { print time; }
  found = 0;
  time = "(unknown)";
}

Then just use the default record separator (newline), so the script sees the file one line at a time. It notes the time and reason fields as they are read, and prints the time at the end of each matching record, whether that end is a blank line or the end of the file.
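To try the line-by-line approach out, here is a minimal sketch that wraps the same awk script in a shell snippet. The sample data is an abridged version of the file from the question, with the timestamps changed and the third record reordered (both are my assumptions, made only so the output is easy to tell apart); the variable names `sample` and `out` are likewise just for illustration.

```sh
# Hypothetical sample input: three records separated by blank lines.
# In the third record the Reason line comes BEFORE the Time line.
sample='Processor: Some_Proc
Capsule abortion no 32
Time (UTC): Fri Jun 15 06:25:10 2012
CapsuleId: 1704167
Reason: Assertion "Reason1"

Processor: Some_Proc
Capsule abortion no 33
Time (UTC): Fri Jun 15 06:25:11 2012
CapsuleId: 1704168
Reason: Assertion "Reason2"

Processor: Some_Proc
Capsule abortion no 34
Reason: Assertion "Reason1"
Time (UTC): Fri Jun 15 06:25:12 2012
CapsuleId: 1704168'

out=$(printf '%s\n' "$sample" | awk '
# Print the remembered time if the record that just ended matched,
# then clear the per-record state.
function reset() {
  if (found) { print time }
  found = 0
  time = "(unknown)"
}

BEGIN { reset() }
END   { reset() }                # flush the last record (no trailing blank line)
$0 == "" { reset() }             # a blank line ends the current record
/^Reason:/ && $3 == "\"Reason1\"" { found = 1 }
/^Time \(UTC\):/ { time = $0 }
')

printf '%s\n' "$out"
```

With this input the first and third records match, so the snippet should print the 06:25:10 and 06:25:12 lines. Because the decision is deferred until the end of the record, the relative order of the Time and Reason lines does not matter. One caveat: `$0 == ""` only recognizes truly empty lines, so a separator line containing spaces would need a test such as `NF == 0` instead.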

Upvotes: 1

Birei

Reputation: 36272

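That looks like a good solution to me; I would have taken the same approach. I would use break instead of next, though, because the intent is simply to stop the loop once the line has been found. Note that in awk, next does not continue with the next iteration of the loop: it abandons the current record and moves on to the next one, skipping any remaining rules. In this script the two behave the same, since nothing follows the loop, but break states the intent more directly.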

for (i=1;i<=NF;i++) {
    if ($i ~ /^Time.*/) {
        print $i
        break
    }
}
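To see how break and next differ in awk, here is a small sketch, again wrapped in shell so it can be run directly. The second `/Reason1/` rule and the two-record sample are hypothetical additions of mine, there only to make the difference visible: after `break` the later rule still runs for the same record, while `next` skips straight to the next record.

```sh
# Hypothetical two-record sample, abridged from the question's file.
sample='Processor: Some_Proc
Capsule abortion no 32
Time (UTC): Fri Jun 15 06:25:10 2012
Reason: Assertion "Reason1"

Processor: Some_Proc
Capsule abortion no 33
Time (UTC): Fri Jun 15 06:25:11 2012
Reason: Assertion "Reason2"'

# Variant 1: break leaves the for loop only; later rules still see the record.
with_break=$(printf '%s\n' "$sample" | awk '
BEGIN { RS = ""; FS = "\n" }     # paragraph mode: one field per line
/Reason1/ {
    for (i = 1; i <= NF; i++) {
        if ($i ~ /^Time/) { print $i; break }
    }
}
/Reason1/ { print "matched: " $2 }   # illustrative extra rule
')

# Variant 2: next abandons the record, so the extra rule never runs for it.
with_next=$(printf '%s\n' "$sample" | awk '
BEGIN { RS = ""; FS = "\n" }
/Reason1/ {
    for (i = 1; i <= NF; i++) {
        if ($i ~ /^Time/) { print $i; next }
    }
}
/Reason1/ { print "matched: " $2 }
')

printf 'break:\n%s\n\nnext:\n%s\n' "$with_break" "$with_next"
```

With break, the output for the matching record is the Time line followed by "matched: Capsule abortion no 32"; with next, only the Time line appears. For the original single-rule script the choice makes no visible difference, so it mostly comes down to which keyword expresses the intent better.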

Upvotes: 1
