Rastus Sazzifrat

Reputation: 21

How to use awk to parse a file with multiple record types

I want to process an input file with awk. The file contains multiple record types, and the type is given by one of the fields in each record. When the record type has a specific value, I need to process the next N records after the current one, which contain data specific to that record type.

Is this doable with awk?

Here is a sample of the format of the incoming file:

 1 [001:01.0] [ 2] IOCTL 048477589
...
...
28 [002:02.0] [ 2] TX(56)
        81480d0d 0a524141 435a5955 5705243  .H...RAACZYW RC
        43544848 41303033 32203034 3032325  CTHHA0032 04025
        332d4343 43432d2d 52435554 4848412  3-CCCC--RCETHHA

So, basically, when a TX record is found, I want to read the next N records and process their data, reading in both the hex values and their ASCII equivalents.

How can I do this?

Upvotes: 2

Views: 570

Answers (1)

danfuzz

Reputation: 4353

It's not entirely clear what you're aiming to do, but here's something that may serve as a start. Replace the print statements with more substantial processing, as needed.

awk '
/^[ 0-9][0-9] / {
    # This is a record header line. Check if it is a TX.
    inTx = ($0 ~ / TX\([0-9]*\)$/);
    if (inTx) {
        print "Start of TX record.";
        next; # Avoid printing the header line below.
    }
}
inTx { print "TX:", $0; }
' file.txt
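
The question talks about reading "the next N records", while the script above keeps going until the next header line. If N really is a fixed, known number, a countdown may be closer to what you want. This is only a sketch: the function name tx_next_n and the awk variable names n and remaining are made up for illustration, and it assumes you supply N yourself rather than deriving it from the file.

```shell
# Hypothetical helper: print at most N data lines after each TX header.
# Usage: tx_next_n N [file...]   (reads stdin when no file is given)
tx_next_n() {
    count=$1
    shift
    awk -v n="$count" '
    /^[ 0-9][0-9] / {
        # Header line: arm the countdown for TX, clear it for anything else.
        remaining = ($0 ~ / TX\([0-9]*\)$/) ? n + 0 : 0
        next
    }
    remaining > 0 {
        print "TX:", $0
        remaining--
    }
    ' "$@"
}
```

Called as tx_next_n 2 file.txt, it would print only the first two data lines of each TX record. A new header line also ends the record early, so a short record does not swallow the headers that follow it. The rest of this answer refers to the header-driven script above, not to this variant.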

Here's a somewhat beefier example file, to make it a little more clear what the script does:

 1 [001:01.0] [ 2] IOCTL 048477589
...
...
28 [002:02.0] [ 2] TX(56)
        81480d0d 0a524141 435a5955 5705243  .H...RAACZYW RC
        43544848 41303033 32203034 3032325  CTHHA0032 04025
        332d4343 43432d2d 52435554 4848412  3-CCCC--RCETHHA
 1 [001:01.0] [ 2] IOCTL 048477589
 2 [dsfsdsdf] [ 2] BLORT
29 [002:02.0] [ 2] TX(77)
        abbababa 0a524141 435a5955 5705243  STUFFSTUFFSTUFF
        43544848 bbbbbbbb 32203034 d0d0d0d  CTHULUCTHULUCTH
        332d4343 43432d2d cccccccc 4848412  BLORTZORCHFNORD
 1 [001:01.0] [ 2] IOCTL 048477589
 2 [dsfsdsdf] [ 2] BLORT

Transcript of running the script:

Start of TX record.
TX:         81480d0d 0a524141 435a5955 5705243  .H...RAACZYW RC
TX:         43544848 41303033 32203034 3032325  CTHHA0032 04025
TX:         332d4343 43432d2d 52435554 4848412  3-CCCC--RCETHHA
Start of TX record.
TX:         abbababa 0a524141 435a5955 5705243  STUFFSTUFFSTUFF
TX:         43544848 bbbbbbbb 32203034 d0d0d0d  CTHULUCTHULUCTH
TX:         332d4343 43432d2d cccccccc 4848412  BLORTZORCHFNORD
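
As for "more substantial processing", here is one way the data lines might be split into their hex and ASCII parts. It is a sketch under an assumption taken from the sample only: every data line has exactly four whitespace-separated hex words, followed by a one- or two-space gap and then the ASCII text (which may itself contain spaces). The function name parse_tx and the variable rest are invented for this example.

```shell
# Hypothetical helper: split each TX data line into hex words and ASCII text.
# Usage: parse_tx [file...]   (reads stdin when no file is given)
parse_tx() {
    awk '
    /^[ 0-9][0-9] / {
        # Header line: remember whether it starts a TX record.
        inTx = ($0 ~ / TX\([0-9]*\)$/)
        next
    }
    inTx {
        # Assumption: four hex words come first, then the ASCII column.
        rest = $0
        for (i = 1; i <= 4; i++)
            sub(/^[ \t]*[^ \t]+/, "", rest)   # strip one leading word
        sub(/^  ?/, "", rest)                 # strip the column separator
        printf "hex=%s%s%s%s ascii=[%s]\n", $1, $2, $3, $4, rest
    }
    ' "$@"
}
```

The ASCII part is taken from the raw line rather than from $5 onward because it can contain spaces, which awk's default field splitting would break apart. If the real file has a different number of hex words per line, the loop bound and the printf need adjusting.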

Upvotes: 2
