jimmax777

Reputation: 11

awk to print header and lines matching specific criteria

I am logging iostat (or similar) output, for instance:

[/] # iostat -xnCT d 5 5
Tue Nov 25 13:45:56 2014
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    3.1   0   0 c0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    3.1   0   0 c0t0d0
    0.1    2.7    1.6    4.8  0.0  0.0    0.1    433.2   0   0 c1
    0.1    2.7    1.5    4.8  0.0  0.0    0.1    3.3   0   0 c1t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    100.1   0   0 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.5   0   0 c1t3d0
    0.1    0.1    0.1    0.0  0.0  0.0    0.0    600.0   0   0 c2
    0.0    0.0    0.0    0.0  0.0  0.0    185.0    0.0   0   0 c2t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c2t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.1   0   0 c2t4d0
    0.0    0.0    0.0    0.0  0.0  0.0    295.0    0.0   0   0 c2t5d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.1   0   0 c2t6d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c2t8d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c2t9d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c2t10d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.1   0   0 c2t11d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c2t12d0
    0.1    0.1    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t4d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t5d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t6d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t8d0

I generate these logs continuously and then use scripts to extract information from them. I am now working on some performance scripts where I need to, for instance, search the log files for entries with high average service times, say values greater than 100.

So here is what I can do:

awk '$7 > 100 || $8 > 100' filename

This gives me all entries where wsvc_t or asvc_t is greater than 100. Note, this is just an example. However, I also want to print the date when this occurred. That cannot be done with grep -B, and I am not sure how to do it with sed or awk, since the number of lines between the date and the matching entry is not fixed.

So is there an easy way to print the lines where $7 or $8 is greater than 100, together with the line above them that contains the date (the one with 2014, or whatever the year is)?

So my result should be something like:

Tue Nov 25 13:45:56 2014
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.1    2.7    1.6    4.8  0.0  0.0    0.1    433.2   0   0 c1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    100.1   0   0 c1t1d0
    0.1    0.1    0.1    0.0  0.0  0.0    0.0    600.0   0   0 c2
    0.0    0.0    0.0    0.0  0.0  0.0    185.0    0.0   0   0 c2t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    295.0    0.0   0   0 c2t5d0

The log files will run to thousands of lines.

Upvotes: 1

Views: 2937

Answers (2)

fedorqui

Reputation: 290415

I would use the following:

awk 'NR<=3 || $7 > 100 || $8 > 100'

This will print lines matching either of these conditions:

  • NR<=3. NR stands for Number of Records, which by default is the number of lines read so far. Thus, we are checking that the line number is lower than or equal to 3 (to print the header).
  • 7th field strictly bigger than 100. This is what you already had.
  • 8th field strictly bigger than 100. This is what you already had.

So the only thing I added to your current script is the NR<=3, which is quite useful when we know exactly which line numbers we want to print, as is the case here.

Test

With your given input stored as a file:

$ awk 'NR<=3 || $7 > 100 || $8 > 100' file
Tue Nov 25 13:45:56 2014
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.1    2.7    1.6    4.8  0.0  0.0    0.1    433.2   0   0 c1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    100.1   0   0 c1t1d0
    0.1    0.1    0.1    0.0  0.0  0.0    0.0    600.0   0   0 c2
    0.0    0.0    0.0    0.0  0.0  0.0    185.0    0.0   0   0 c2t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    295.0    0.0   0   0 c2t5d0
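
If one log file holds several samples (iostat ... 5 5 writes five of them), NR<=3 only covers the first header. A possible variant keys on what the lines look like rather than on their position. This is only a sketch under a few assumptions: the timestamp always has the "Tue Nov 25 13:45:56 2014" shape, the column-heading line ends in the word device, and the wrapper name print_dates_and_slow is made up for illustration. The +0 is there because the heading line has the text wsvc_t in field 7, and a bare $7 > 100 compares that as a string and lets the heading through.

```shell
# Hypothetical wrapper name; pass the log file as an argument or pipe it in.
print_dates_and_slow() {
    awk '
        # Any line shaped like "Tue Nov 25 13:45:56 2014" is printed as-is.
        /^[A-Z][a-z][a-z] [A-Z][a-z][a-z] +[0-9]+ [0-9:]+ [0-9]+$/ { print; next }
        # Skip the column-heading line, whose 7th field is the text "wsvc_t".
        $NF == "device" { next }
        # Data rows: +0 forces a numeric comparison on both service-time fields.
        $7 + 0 > 100 || $8 + 0 > 100
    ' "$@"
}
```

Called as print_dates_and_slow file, this prints every timestamp, including those of samples where nothing exceeded the threshold, so it trades some noise for having no state to track.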

Upvotes: 2

user1646075


You need to capture the interesting header data into variables as it passes by. Then you need to print the headers once, when the first interesting line of a block is seen. Finally, you need to reset the "headers already printed" flag each time a new date is seen.

What is your preferred language? I'm not up-to-date on awk's current capabilities regarding patterns; I stopped caring when I started using Perl. So here's some Perl code I've banged up without testing:

source-process-feeding-lines | perl -n -e '
    # date line, e.g. "Tue Nov 25 13:45:56 2014": a new block starts here
    if(/^(\w+ \w+ \d+ \d+:\d+:\d+ \d+)$/) {
        $date = $1;
        $header1 = $header2 = $printed = undef;    # reset heading state
        next;    # next line please
    }

    if(/extended device statistics/) {
        $header1 = $_;
        next;
    }

    if(/^(\s*\w\/\w.*device)$/) {   # simple but probably sufficient recogniser
        $header2 = $_;
        next;
    }

    # assume a data line here
    if(/your pattern for an interesting line/) {
        if(! $printed) {
            $printed = 1;     # prevent a 2nd printing unless the date changes
            # $date was captured without its newline, so add one back
            print "$date\n", $header1, $header2;
        }

        print;    # print your interesting line
    }
'

This is close enough to what I think you're asking. Debugging probably needs applying!
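
For comparison, the same capture / print-once / reset logic could look roughly like this in awk. Treat it as a sketch, not a tested production script: the function name iostat_slow and the LIMIT environment variable are invented for illustration, the timestamp is assumed to look like "Tue Nov 25 13:45:56 2014", and the column-heading line is assumed to end in the word device.

```shell
# Hypothetical helper; reads the log from the named file(s) or from stdin.
# LIMIT is an assumed environment variable, defaulting to 100.
iostat_slow() {
    awk -v limit="${LIMIT:-100}" '
        # Timestamp line: start a new block and forget the old header state.
        /^[A-Z][a-z][a-z] [A-Z][a-z][a-z] +[0-9]+ [0-9:]+ [0-9]+$/ {
            header = $0; printed = 0; next
        }
        # The two heading lines after the timestamp: append them to the buffer.
        /extended device statistics/ || $NF == "device" {
            header = header "\n" $0; next
        }
        # Data line: $7 is wsvc_t, $8 is asvc_t; +0 forces numeric comparison.
        $7 + 0 > limit + 0 || $8 + 0 > limit + 0 {
            if (!printed) {
                if (header != "") print header   # header once per block
                printed = 1
            }
            print
        }
    ' "$@"
}
```

Run as iostat_slow filename, or LIMIT=200 iostat_slow filename for a different threshold. Blocks with no row above the limit produce no output at all, which keeps the result short on logs that run to thousands of lines.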

Upvotes: 1
