Typedragon33

Reputation: 139

How to extract specific data from grep command in bash?

I'm trying to write my own script to tell me if I've used more than 500 MiB of my data. I'm using vnstat -d for the information about data usage. The vnstat -d output is in the attached screenshot.

Output should be:

  1. Only from the "Total column"
  2. Only have values greater than 500.

I only want values from the "total" column, but my output lists data from all the columns. The script below should make this clearer:

#!/bin/bash
for i in `vnstat -d | grep -a [0-9]`; # get numerical values in i (-a flag, as vnstat outputs in binary)
do
    NUMBER=$(echo $i | grep -o '[5-9][0-9][0-9]'); # store values >500 in a var called NUMBER
    echo $NUMBER;
done;

I'm a self-learning newb here so please try not to bash (pun) me.

Current output which I'm receiving from above script:

600
654
925
884
923
871
967
868

My desired output should be:

654   
923    
967

Upvotes: 0

Views: 1920

Answers (5)

macaque

Reputation: 13

vnstat has several options to format the output. You can use vnstat --dumpdb, vnstat --json or vnstat --xml to have well-formatted data that you can then parse more easily (for example with jq if you choose the JSON format).

For example:

vnstat --json | jq '.interfaces[] | select(.id == "eth0") | .traffic | .days[1] | .rx'

This extracts the number of KiB received on the interface eth0 yesterday (day 0 is today, 1 is yesterday, and so on).

To get the total rx+tx, you can use:

vnstat --json | jq '.interfaces[] | select(.id == "eth0") | .traffic | .total | .rx+.tx'

You can also sum several days, for example today and yesterday:

vnstat --json | jq '.interfaces[] | select(.id == "eth0") | .traffic | [.days[0,1] | .rx+.tx] | add'

Instead of days, you can also reference "months" or "hours" (for hours, be careful: the id does not have the same meaning, it identifies the hour itself).

Upvotes: 0

thanasisp

Reputation: 5965

You want to parse a pipe-delimited table and check only a specific column. There are tools more appropriate than grep for this job: for example, you could write a small bash script that uses the cut command to extract the data and process it, or you could use awk.

Here is a solution with awk. It prints the numbers > 500 in that column, total. Pipe your command's output to:

awk -F "|" '($3+0>500){print $3}'
  • -F sets the field delimiter to |
  • $3+0 is used to convert a string starting with a number to that number, so that we can handle it as a number and do the comparison.


Now, if you really want to extract all values where the total column is > 500 MiB, the expected output should also include all values expressed in GiB, since they are > 1000 MiB. For example, the minimum GiB value in your evil screenshot is 0.98 GiB, which is 1003 MiB. So we can add this to the first condition:

awk -F "|" '($3 ~ /GiB/ || $3+0>500){print $3}'


Now if you want the output to be only integers in MiB, we can modify it to:

awk -F "|" '($3 ~ /GiB/){$3=1024*$3+0} ($3+0>500){printf "%.0f\n",$3}'

Here we convert all GiB values to MiB, and we do the comparison after that.
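As a quick check of that last command, here is a minimal sketch that feeds it a few made-up rows laid out the way this answer assumes (rx | tx | total | avg. rate). The names sample_rows and totals_over_500_mib, and all the figures, are hypothetical and not taken from the question's screenshot.

```shell
#!/bin/bash
# Hypothetical sample rows; real `vnstat -d` output may differ between versions.
sample_rows() {
  cat <<'EOF'
      01/01/17    400.00 MiB |  200.00 MiB |  600.00 MiB |   56.89 kbit/s
      01/02/17    100.00 MiB |   50.00 MiB |  150.00 MiB |   14.22 kbit/s
      01/03/17    700.00 MiB |  303.52 MiB |    0.98 GiB |   95.15 kbit/s
EOF
}

# Reads table rows on stdin and prints the totals above 500 MiB as whole MiB.
totals_over_500_mib() {
  awk -F '|' '
    ($3 ~ /GiB/) { $3 = 1024 * $3 + 0 }   # "0.98 GiB" -> 1003.52 (MiB)
    ($3 + 0 > 500) { printf "%.0f\n", $3 } # numeric prefix of the field, rounded
  '
}

sample_rows | totals_over_500_mib
```

With these rows the sketch should print 600 and 1004: the 0.98 GiB row becomes 1003.52 MiB and is rounded by %.0f, while the 150 MiB row is dropped.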

Upvotes: 1

Dat Huynh

Reputation: 88

#!/bin/bash
IFS=$'\n'
for i in `vnstat -d`; do # get each line
  VALUE=$(echo "$i" | cut -d\| -f3) # split the line by '|' and take the total value with its unit, in case you want to check for GiB values
  NUMBER=$(echo "$VALUE" | grep -o '[0-9.]*' | cut -d. -f1) # take the number part and store its integer part in NUMBER
  if [[ $NUMBER -ge 500 && "$VALUE" == *"MiB"* || "$VALUE" == *"GiB"* ]]; then # if the number is greater than or equal to 500 MiB, OR the value is in GiB
    echo "$VALUE" # echo the value
  fi
done

Of course, you can strip out the GiB check if you want to.

Edit: Added IFS=$'\n' at the beginning. This makes the for loop split the output on newlines, so each iteration sees one whole line.
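A commonly suggested alternative to changing IFS for the whole script is a while read loop, which sets IFS only for the read itself. The following is a sketch of that swapped-in technique, not the script above; sample_table and print_large_totals are hypothetical names and the rows are made up.

```shell
#!/bin/bash
# Hypothetical sample rows in the assumed "rx | tx | total | avg. rate" layout.
sample_table() {
  cat <<'EOF'
         day         rx      |     tx      |    total    |   avg. rate
      01/01/17    400.00 MiB |  200.00 MiB |  600.00 MiB |   56.89 kbit/s
      01/02/17    100.00 MiB |   50.00 MiB |  150.00 MiB |   14.22 kbit/s
      01/03/17    700.00 MiB |  303.52 MiB |    0.98 GiB |   95.15 kbit/s
EOF
}

# Prints the "total" cell of every row that is >= 500 MiB or expressed in GiB.
print_large_totals() {
  local total number unit whole
  # IFS='|' applies to this read only; the third field is the total column.
  while IFS='|' read -r _ _ total _; do
    read -r number unit <<< "$total"        # "  600.00 MiB " -> 600.00 and MiB
    whole=${number%%.*}                     # integer part, e.g. 600
    [[ $whole =~ ^[0-9]+$ ]] || continue    # skip headers and separator lines
    if [[ $unit == GiB ]] || [[ $unit == MiB && $whole -ge 500 ]]; then
      echo "$number $unit"
    fi
  done
}

sample_table | print_large_totals
```

Because IFS is never changed globally, the rest of the script keeps the default word splitting.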

Upvotes: 0

Ron

Reputation: 6551

Simplified:

#!/bin/bash
if [[ $(( $(vnstat -d --oneline|cut -d';' -f6|cut -d. -f1|paste -sd '+') )) -ge 500 ]];then
  echo 500 Mb reached
fi

(The script takes the specified field from the one-line, CSV-like output for each interface, keeps only the whole-number part of each value, and sums them. It then checks whether that sum is greater than or equal to 500 and, if so, prints a message.)

Note:

-f6 will parse the "total for today" traffic

you can replace it with:

-f4 = rx for today

-f5 = tx for today
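To make the cut/paste pipeline easier to follow, here is a small sketch run against made-up --oneline style rows, one per interface. The function names and the sample values are hypothetical, and only the first seven semicolon-separated fields are shown. Cutting at the dot also discards the unit, so this sketch, like the script above, assumes every value is reported in MiB.

```shell
#!/bin/bash
# Hypothetical one-line rows: version;interface;date;rx;tx;total;rate
sample_oneline() {
  cat <<'EOF'
1;eth0;01/03/17;300.50 MiB;120.25 MiB;420.75 MiB;39.89 kbit/s
1;wlan0;01/03/17;80.00 MiB;20.40 MiB;100.40 MiB;9.52 kbit/s
EOF
}

# Sums the whole-number part of field 6 (today's total) over all rows.
today_total_sum() {
  local sum_expr
  # cut -f6 -> "420.75 MiB", cut -d. -f1 -> "420", paste -> "420+100"
  sum_expr=$(cut -d';' -f6 | cut -d. -f1 | paste -sd '+' -)
  echo $(( $sum_expr ))
}

if [[ $(sample_oneline | today_total_sum) -ge 500 ]]; then
  echo "500 MiB reached"
fi
```

Here paste builds the string 420+100, which the arithmetic expansion evaluates to 520, so the message is printed.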

Upvotes: 1

Shawn

Reputation: 52334

I'd use awk. Something like (untested)

vnstat -d | awk '$1 == "estimated" { exit }
                 ($9 == "GiB" && $8 > 0.5) ||
                 ($9 == "MiB" && $8 > 500) { print $8 " " $9 }'
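Since the command above is marked untested, here is a minimal sketch that runs it against made-up rows. It assumes whitespace-separated columns in which the total and its unit land in fields 8 and 9, and a final line starting with "estimated"; sample_daily and large_totals are hypothetical names and the figures are invented.

```shell
#!/bin/bash
# Hypothetical sample rows; the "estimated" line should be ignored.
sample_daily() {
  cat <<'EOF'
      01/01/17    400.00 MiB |  200.00 MiB |  600.00 MiB |   56.89 kbit/s
      01/02/17    100.00 MiB |   50.00 MiB |  150.00 MiB |   14.22 kbit/s
      01/03/17    700.00 MiB |  303.52 MiB |    0.98 GiB |   95.15 kbit/s
     estimated       900 MiB |     400 MiB |    1.27 GiB |
EOF
}

# Prints totals above 500 MiB (or above 0.5 GiB) with their unit.
large_totals() {
  awk '$1 == "estimated" { exit }            # stop before the projection line
       ($9 == "GiB" && $8 > 0.5) ||
       ($9 == "MiB" && $8 > 500) { print $8 " " $9 }'
}

sample_daily | large_totals
```

On these rows it should print 600.00 MiB and 0.98 GiB and stop before the estimated line.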

Upvotes: 0
