Shervin

Reputation: 409

How can I get the length of each output line of grep

I am very new to bash scripting. I have a network trace file I want to parse. Part of the trace file is (two packets):

    [continues...]
    +---------+---------------+----------+
    05:00:00,727,744   ETHER
    |0  
    |00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|55|

    +---------+---------------+----------+
    05:00:00,727,751   ETHER
    |0  
    |00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|56|00|00|3a|01|

    [continues...]

For each packet, I want to print the time stamp and the length of the packet (the number of hex values on the line after the |0 header), so the output will look like:

    05:00:00.727744 20 bytes
    05:00:00.727751 24 bytes

I can get the line with time stamp and the packets separately using grep in bash:

times=$(grep  '..\:..\:' $fileName)
packets=$(grep  '..|..|' $fileName)

But I can't work with the separate output lines after that. The whole result is concatenated in the two variables "times" and "packets". How can I get the length of each packet?

P.S. a good reference that really explains how to do bash programming, rather than just doing examples would be appreciated.

Upvotes: 2

Views: 3721

Answers (2)

David W.

Reputation: 107090

Okay, with plain old shell...

You can get the length of the line like this:

line="|00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|55|"
wc -c<<<$line
62

wc -c reports sixty-two characters for that line. Think of each byte as |00, where 00 can be any two hex digits. On top of that, there's an extra | on the end, and wc -c also counts the NL on the end.

So, if we take the value of wc -c and subtract 2, we get 60. If we divide that by 3, we get 20, which is the number of bytes.
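
The arithmetic above can be wrapped in a small function as a worked example. This is only a sketch: the name bytes_from_wc is made up for illustration, and it assumes the line has the |xx|xx|...|xx| shape shown in the question, with a single trailing pipe.

```shell
#!/bin/bash
# bytes_from_wc is a hypothetical helper name used only for this illustration.
bytes_from_wc() {
    local count
    # printf appends the newline that wc -c will count, just as <<< does.
    count=$(printf '%s\n' "$1" | wc -c)
    # Discount the newline and the trailing |, then divide by the three
    # characters that each |xx group occupies.
    echo $(( (count - 2) / 3 ))
}

bytes_from_wc "|00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|55|"
```

For the first packet in the question this works out to (62 - 2) / 3 = 20.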

Okay, now we need a little loop, figure out the various lines, and then parse them:

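#!/bin/bash

while read line
do
    # Timestamp lines start with two digits, e.g. 05:00:00,727,744
    if [[ $line =~ ^[[:digit:]]{2} ]]
    then
        echo -n "${line% *}"
    # Packet lines start with | and a two-character hex byte such as |00 or
    # |a0, so use [[:xdigit:]]; [[:digit:]] would miss bytes containing a-f.
    elif [[ $line =~ ^\|[[:xdigit:]]{2} ]]
    then
        length=$(wc -c<<<$line)
        ((length-=2))    # drop the newline and the trailing |
        ((length/=3))    # each byte takes three characters: |xx
        echo "$length bytes"
    fi
done < test.txt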

There's a PURE BASH solution to your problem!

You're a beginning Bash programmer, and you have no idea what's going on...

Let's take this one step at a time:

A common way to loop through a file in BASH is using a while read loop. This combines the while with a read:

while read line
do
   echo "My line is '$line'"
done < test.txt

Each line in test.txt is being read into the $line shell variable.

Let's take the next one:

if [[ $line =~ ^[[:digit:]]{2} ]]

This is an if statement. Always use the [[ ... ]] brackets because they protect unquoted variables from word splitting and filename expansion. Plus, they have a bit more power, such as the =~ operator used here.

The =~ is a regular expression match. The [[:digit:]] matches any digit. The ^ anchors the regular expression to the beginning of the line, and {2} means I want exactly two of these. This says if I match a line that starts with two digits (which is your timestamp line), execute this if clause.

${line% *} is a pattern filter. The % removes the shortest match of the glob pattern that follows it (here, a space followed by anything) from the right-hand end of $line. I use this to remove the ETHER from my line. The -n tells echo not to print a NL.
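
The same family of expansions can also turn the timestamp into the 05:00:00.727744 form asked for in the question. The sketch below is an assumption-laden illustration rather than part of the script above: format_timestamp is a made-up name, and it assumes the timestamp always has exactly two commas. It uses %% (longest suffix) and # (shortest prefix) alongside %.

```shell
#!/bin/bash
# format_timestamp is a hypothetical helper name used for illustration.
# It assumes a line such as "05:00:00,727,744   ETHER".
format_timestamp() {
    local ts="${1%% *}"      # drop the longest suffix that starts at a space
    local hms="${ts%%,*}"    # "05:00:00": everything before the first comma
    local frac="${ts#*,}"    # "727,744": everything after the first comma
    # ${frac%,*} is "727" and ${frac#*,} is "744"
    echo "$hms.${frac%,*}${frac#*,}"
}

format_timestamp "05:00:00,727,744   ETHER"
```

Using %% rather than % removes all of the padding spaces in one go, so the result does not depend on how many spaces precede ETHER.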

Let's take my elif which is an else if clause.

elif [[ $line =~ ^\|[[:xdigit:]]{2} ]]

Again, I am matching a regular expression. This one is anchored to the start of the line (the ^) and begins with a |. I have to put a backslash in front because | is a magical regular expression character and \ kills the magic. It's now just a pipe. That's followed by two hexadecimal digits: [[:xdigit:]] rather than [[:digit:]], because bytes such as a0 contain letters that [[:digit:]] would not match. Note this skips |0 but catches |00.
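
To see how the two tests sort the lines of the trace, here is a small sketch. The function name line_kind and the labels it prints are invented for this illustration. It uses [[:xdigit:]] for the packet test so that bytes containing hex letters are recognised, and it tightens both patterns slightly (a trailing : and a trailing \|), which is an assumption about the trace layout shown in the question.

```shell
#!/bin/bash
# line_kind is a hypothetical helper; its labels are made up for this example.
line_kind() {
    local line=$1
    if [[ $line =~ ^[[:digit:]]{2}: ]]; then
        echo timestamp      # e.g. 05:00:00,727,744   ETHER
    elif [[ $line =~ ^\|[[:xdigit:]]{2}\| ]]; then
        echo packet         # e.g. |00|03|a0|...
    else
        echo other          # separators and the |0 header
    fi
}

line_kind "05:00:00,727,744   ETHER"
line_kind "|0  "
line_kind "|a0|09|5c|"
```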

Now, we have to do some calculations:

length=$(wc -c<<<$line)

The $(...) says to execute the enclosed command and substitute its output back into the line. The wc -c counts the characters, and <<<$line feeds it the contents of $line to count. For the first packet this gave us 62 characters. We have to subtract 2, then divide by 3. That's the next two lines:

((length-=2))
((length/=3))

The ((...)) allows me to do integer based math. The first subtracts 2 from $length and the next divides it by 3. Now, I can echo this out:

echo "$length bytes"

And that's our pure Bash answer to this question.
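
Since wc is an external program, a variant that stays entirely inside Bash can use ${#line}, which gives the length of the string without any newline. The following is a sketch under stated assumptions, not a drop-in replacement: summarize_trace is a made-up name, it reads the trace from standard input, and it assumes the layout shown in the question.

```shell
#!/bin/bash
# summarize_trace is a hypothetical name; it reads a trace on standard input.
summarize_trace() {
    local line
    # read without IFS= trims leading and trailing whitespace, which keeps
    # stray spaces from inflating the character count below.
    while read -r line; do
        if [[ $line =~ ^[[:digit:]]{2}: ]]; then
            printf '%s ' "${line%% *}"
        elif [[ $line =~ ^\|[[:xdigit:]]{2}\| ]]; then
            # No newline to discount here, only the trailing |.
            echo "$(( (${#line} - 1) / 3 )) bytes"
        fi
    done
}

summarize_trace <<'EOF'
+---------+---------------+----------+
05:00:00,727,744   ETHER
|0
|00|03|a0|09|5c|1c|00|10|07|df|a4|20|08|00|45|00|00|38|e7|55|
EOF
```

It could be run against a real file with summarize_trace < test.txt.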

Upvotes: 2

michas

Reputation: 26555

You really don't want to do such things with your shell.

You want to write a real parser that understands the format to output the needed information.

For a quick and dirty hack you can do something like this:

perl -wne 'print "$& " if /^\d\S*/; print split(/\|/)-2, " bytes\n" if /^\|..\|/'

Upvotes: 1
