404Cat

Reputation: 143

Extracting text file information via command line/script

I'd like to extract only certain information from a block of text. I have had great luck asking the StackOverflow community for expert help, especially with tricky topics (regex, Perl, sed, awk).

The text is the output of a tshark command. I would like to trim it down so that only the columns I need are printed.

Any help would be appreciated. I am currently learning these tools, but it's slow going!

A script or command that produces the desired output below would be seriously helpful.

Original:

                                                     Host 1            Host 2            Total            Relative         Duration
Host 1                   Host 2                Frames     Bytes  Frames     Bytes  Frames     Bytes        Start
192.168.0.14         <-> 192.168.0.13            3898   4872033    1971    120545    5869   4992578     0.001886000       283.6363
192.168.0.162        <-> 192.168.0.71               2      1992       2      1992       4      3984   176.765198000        77.0542
192.168.0.191        <-> 192.168.0.150              3      2988       0         0       3      2988   199.319020000        59.7055
192.168.0.227        <-> 192.168.0.157              3      2988       0         0       3      2988   197.013283000        76.7197
192.168.0.221        <-> 192.168.0.94               3      2988       0         0       3      2988   196.312847000        59.7065
192.168.0.75         <-> 192.168.0.58               2      1992       1       996       3      2988   191.995706000        59.7121
224.0.0.252          <-> 192.168.0.13               3       207       0         0       3       207   180.521299000         0.0536
192.168.0.191        <-> 192.168.0.50               1       996       2      1992       3      2988   173.452130000        59.6849
192.168.0.41         <-> 192.168.0.13               3      2988       0         0       3      2988   167.180087000        76.6960
192.168.0.206        <-> 192.168.0.153              1       996       1       996       2      1992   270.528070000         4.4070

Desired:

Host 1     Host 2     Total Bytes
x.x.x.x    x.x.x.x    N
x.x.x.x    x.x.x.x    N
x.x.x.x    x.x.x.x    N

Upvotes: 1

Views: 99

Answers (3)

beasy

Reputation: 1227

in perl:

tshark | perl -lane 'print join "\t", ($F[0], $F[2], $F[8])'

The -a option splits each line of stdin into an array called @F. The column numbers don't match the array indices directly: the indices are zero-based and the "<->" separator counts as a field of its own, so $F[0], $F[2] and $F[8] are host 1, host 2 and total bytes. -a splits on whitespace by default; you can set a different delimiter with -F if you like.

-F would also help get the headers aligned correctly, but to simply skip the misaligned headers, add next if $. < 3; before print, which skips the first two lines.

Upvotes: 2

guido

Reputation: 19194

Given your output is in filename:

sed 's/ \+/ /g' filename | tail -n +3 | cut -f1,3,9 -d ' ' | sed 's/ /\t/g' | sort -r -n -k3
  • replace each run of spaces with a single space, for tokenizing
  • discard the first two header lines
  • keep columns 1, 3, and 9 (host 1, host 2, total bytes)
  • replace spaces with tabs to get columns back
  • sort numerically by total bytes, in descending order

output:

192.168.0.14    192.168.0.13    4992578
192.168.0.162   192.168.0.71    3984
192.168.0.75    192.168.0.58    2988
192.168.0.41    192.168.0.13    2988
192.168.0.227   192.168.0.157   2988
192.168.0.221   192.168.0.94    2988
192.168.0.191   192.168.0.50    2988
192.168.0.191   192.168.0.150   2988
192.168.0.206   192.168.0.153   1992
224.0.0.252     192.168.0.13    207
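
Note that \+ in the pattern and \t in the replacement are GNU sed extensions. If your sed does not support them, the same steps can be sketched with tr instead. The function name top_talkers below is made up for illustration, and the file is again assumed to have two header lines.

```shell
# top_talkers FILE (hypothetical name): host pairs and total bytes,
# largest total first.
top_talkers() {
  tab=$(printf '\t')
  tail -n +3 "$1" |           # drop the two header lines
    tr -s ' ' |               # squeeze each run of spaces into one
    cut -d ' ' -f 1,3,9 |     # keep host 1, host 2, total bytes
    tr ' ' '\t' |             # tabs between the remaining columns
    sort -t "$tab" -k 3,3nr   # numeric, descending, on total bytes only
}

# Usage:
#   top_talkers filename
```

Rows with the same total may come out in a different order than in the listing above, because the sort key here is restricted to the third column.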

Upvotes: 1

mklement0

Reputation: 438073

Try:

awk '
 BEGIN { printf "%-15s %-15s %s\n",  "Host 1", "Host 2", "Total Bytes" }
 NR>2  { printf "%-15s %-15s %11s\n", $1, $3, $9 }
' file

Adjust the output-field widths as needed.

  • The BEGIN block is used to print the output header line.
  • NR > 2 ensures that the input header lines are skipped.
  • printf is used with field-width specifiers to create column-aligned output.
    • A - before the width specifier indicates left-aligned output (e.g., %-15s); without it, the value is right-aligned (e.g., %11s).
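
If you would rather not adjust the widths by hand, one possible variation is to buffer the rows and derive the two host-column widths from the data. This is only a sketch: the function name format_hosts is made up for illustration, and it holds the whole table in memory, which is fine for a listing of this size.

```shell
# format_hosts (hypothetical name): reads the table on stdin and sizes the
# two host columns to the longest address seen.
format_hosts() {
  awk '
    NR > 2 {                          # skip the two input header lines
      n++
      h1[n] = $1; h2[n] = $3; bytes[n] = $9
      if (length($1) > w1) w1 = length($1)
      if (length($3) > w2) w2 = length($3)
    }
    END {
      if (w1 < 6) w1 = 6              # at least as wide as the "Host 1" label
      if (w2 < 6) w2 = 6
      hdr = "%-" w1 "s  %-" w2 "s  %s\n"     # build the formats from the widths
      row = "%-" w1 "s  %-" w2 "s  %11s\n"
      printf hdr, "Host 1", "Host 2", "Total Bytes"
      for (i = 1; i <= n; i++) printf row, h1[i], h2[i], bytes[i]
    }
  '
}

# Usage:
#   format_hosts < file
```

The format strings are assembled by concatenation rather than with a * width, since not every awk accepts * in printf.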

Upvotes: 2
