Aaron Perry

Reputation: 1051

Rounding numerical values with trailing 9s

I'm trying to find the max values in a numeric string and some of the data contains trailing 9s.

999999999999 63 66 69 71 73 75 76 78 80 81 81 80 79 74 67 63999999999999999

I've been using the following command to find the max value, but it treats the values with trailing 9s (e.g., 6399999....) as the "max" and ignores the actual max values. Some of the data is also bad and consists of nothing but 9s.

grep -Eo '[0-9]+' file_temp | sort -rn | head -n 1 > file_temp_max

How can I get rid of the bad data (e.g., 999999), and how can I correct the values with trailing 9s (6399999... > 64) so that they are rounded and included in the data set?

Upvotes: 0

Views: 145

Answers (5)

Adam Katz

Reputation: 16138

Building from your example code:

grep -Eo '[0-9]+' file_temp | awk '
  $1 ~ /999999999999999/ { sub(/999999999999999$/,""); $1++}
  $0 != 999999999999'

This puts each number on its own line, then uses awk to revise each line. The first awk line matches any number containing fifteen 9s, strips them from the end, and increments what is left. The second line prints anything that isn't exactly twelve nines.

The above assumes 1239999999999999999 should be 1240. If instead it should be 124:

grep -Eo '[0-9]+' file_temp | awk '
  $1 ~ /^999+$/ { next }
  $1 ~ /999$/ { sub(/9+$/,""); $1++}
  { print }'

The first awk line skips lines that are just nines, the second removes all trailing nines and increments the number, and the third prints. I'm keying on 3+ nines on the assumption that 9 and 99 are valid.
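
As a rough sketch, the second variant can be chained with the sort and head stages from the question to get the maximum directly. The function name clean_max is made up for illustration, and the sample line is the one from the question.

```shell
#!/bin/sh
# clean_max is a hypothetical helper name, not part of the original answer.
# It reads text on stdin, drops tokens made only of nines, repairs values
# that end in three or more nines, and prints the largest remaining value.
clean_max() {
  grep -Eo '[0-9]+' | awk '
    $1 ~ /^999+$/ { next }                   # only nines: bad data, skip it
    $1 ~ /999$/   { sub(/9+$/, ""); $1++ }   # strip trailing nines, add one
    { print }' | sort -rn | head -n 1
}

echo '999999999999 63 66 69 71 73 75 76 78 80 81 81 80 79 74 67 63999999999999999' | clean_max
```

For the sample line this prints 81, since 63999999999999999 is repaired to 64 and the all-nines token is dropped.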

Upvotes: 0

fedorqui

Reputation: 289665

To "clean" the data, you can do the following by looping through all the fields:

  • If it consists only of 9s, remove it.
  • If it ends with multiple 9s, remove them and increment the remaining number by one.

See it in action with your given input:

$ awk '{for(i=1;i<=NF;i++) {if ($i~/^9+$/) $i=""; if (sub(/99+$/,"",$i)) $i++}}1' a
 63 66 69 71 73 75 76 78 80 81 81 80 79 74 67 64

Then getting the maximum value is trivial using any of the algorithms in "How to get the biggest number in a file?"
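
A minimal sketch of doing the cleaning and the max in one awk pass. The name max_clean is hypothetical, and the sketch assumes a value is only repaired when it ends in two or more 9s, so that 69 and 79 are left alone.

```shell
#!/bin/sh
# max_clean is a hypothetical name used only for this sketch.
# Assumption: at least two trailing 9s are required before a value is repaired.
max_clean() {
  awk '{
    for (i = 1; i <= NF; i++) {
      v = $i
      if (v ~ /^9+$/) continue             # only 9s: treat as bad data
      if (sub(/99+$/, "", v)) v++          # strip the trailing 9s, add one
      if (!seen || v + 0 > max) { max = v + 0; seen = 1 }
    }
  }
  END { if (seen) print max }'             # print nothing if no valid value
}

echo '999999999999 63 66 69 71 73 75 76 78 80 81 81 80 79 74 67 63999999999999999' | max_clean
```

Working on a copy of each field (v) leaves the input line untouched, which matters only if the cleaned line is printed elsewhere.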

Upvotes: 1

glenn jackman

Reputation: 246807

I'm assuming that "a space followed by 2 digits" is a valid way to extract the numbers you want:

echo 999999999999 63 66 69 71 73 75 76 78 80 81 81 80 79 74 67 63999999999999999 | 
grep -o ' [0-9][0-9]' | 
sort -n | 
tail -1

produces

 81

Upvotes: 0

anubhava

Reputation: 785128

You can use this awk:

awk -v RS=' ' '{gsub(/9+$/, ".&", $1); $1=int($1); print $1; if ($1>max) max=$1}
                END{print "max = ", max}' file
0
63
66
6
71
73
75
76
78
80
81
81
80
7
74
67
64
max =  81

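gsub(/9+$/, ".&", $1) inserts a decimal point before the trailing 9s.

$1=int($1) truncates the value to an integer. 63.999999999999999 has more digits than a double can hold, so awk already stores it as 64; shorter values are cut rather than rounded, which is why 69 and 79 appear as 6 and 7 above.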

if ($1>max) max=$1 is simple max computation.
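
A quick check of how int() behaves on the two kinds of value seen in the output above. This is an illustration only; it assumes an awk that stores numbers as IEEE 754 doubles, and show_int is a made-up name.

```shell
#!/bin/sh
# show_int is a hypothetical name for this illustration.
# Assumes awk numbers are IEEE 754 doubles (the usual case).
show_int() {
  awk 'BEGIN {
    print int("6.9")                  # fractional part is dropped, giving 6
    print int("63.999999999999999")   # not representable; nearest double is 64
  }'
}

show_int
```

So the 64 in the output comes from the limits of floating-point precision, not from int() rounding.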

Upvotes: 0

mproffitt

Reputation: 2527

This is a slightly different approach from Adam's answer and uses sed from within a loop.

First off, I'm working on the assumption that you don't know how many 9's will be included. Secondly, I'm using an intermediate conversion to float.

for line in $(cat file_temp); do 
    i=$(echo $line |  sed 's/../.&/;t;s/^.$/.0&/');
    printf "%.02f\n" $i;
done | sed 's/\.//;s/^0//' | sort -nr

Breakdown:

sed 's/../.&/;t;s/^.$/.0&/' inserts a decimal point before the first two digits (a lone digit becomes .0N instead)

printf "%.02f\n" $i; prints the value as a floating-point number with two decimal places, which rounds it for you.

sed 's/\.//;s/^0//' strips the decimal point and one leading 0, leaving just the integer

Upvotes: 1
