Zaphod Beeblebrox

Reputation: 468

Format floating-point numbers with sed to equal precision - add trailing zeros

Is it possible to use sed to pad floating-point numbers to a fixed precision of, e.g., 8 digits after the decimal point? Because this problem is part of a bigger context, it is essential to use GNU sed.

Input Examples:

0
126
99.0234
.38
-47.88
-234.23001101
40565004.22

The goal is to append trailing zeros to each number until the desired precision is reached. No input number has more digits after the decimal point than the desired precision. A secondary (minor) goal is to add a leading zero before the decimal point if it is missing.

This is the same result that bash's LC_NUMERIC=en_US.UTF-8 printf "%0.8f" .123 produces.

Desired output:

0.00000000
126.00000000
99.02340000
0.38000000
-47.88000000
-234.23001101
40565004.22000000

All I have found so far is about limiting the number of digits, not extending it.


An alternative solution would be a way to call the shell command LC_NUMERIC=en_US.UTF-8 printf "%0.8f\n" \3 for a particular match group (here: the third match group) within the sed processing.

Input: <rowNo>|<date time>|<number> (example: 1|2024-02-01 00:27:16|.38, numbers as above)

command (obviously not working, for third match group processing):

sed 's/^\([0-9]*\)|\(.*\)|\([0-9\.]*\)/\1|\2|$(LC_NUMERIC=en_US.UTF-8 printf "%.8f\n" \3))/g' test.csv

Update in reply to some comments: exact floating-point representation is not the point. The numbers in my input were printed by a program outside my control as string representations with up to 8 digits after the decimal point. I added the 'bash' tag because I thought it might be helpful.

Upvotes: -1

Views: 221

Answers (5)

Toby Speight

Reputation: 30951

#!/usr/bin/sed -f

# Add trailing decimal point if absent
/\./!s/$/./
# Add leading zero if no digit before point
/[0-9]\./!s/\./0&/
# Add eight zeroes
s/$/00000000/
# Trim to exactly eight digits after point
s/\(\.[0-9]\{8\}\).*/\1/
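To try this without saving a script file, the same four commands can be passed with -e. This is only a sketch: the function name pad8 is made up for the illustration, and the sample values are the ones from the question.

```shell
# Same four sed commands as the script above, wrapped in a shell function.
# "pad8" is a hypothetical name used only for this illustration.
pad8() {
    sed -e '/\./!s/$/./' \
        -e '/[0-9]\./!s/\./0&/' \
        -e 's/$/00000000/' \
        -e 's/\(\.[0-9]\{8\}\).*/\1/' "$@"
}

# Feed the sample numbers from the question, one per line.
printf '%s\n' 0 126 99.0234 .38 -47.88 -234.23001101 40565004.22 | pad8
```

With no file arguments the function reads standard input; with arguments (for example pad8 test.csv) sed reads those files instead.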

Upvotes: 1

M. Nejat Aydin

Reputation: 10133

This sed command might be what you are looking for:

sed 's/^\(-\{0,1\}\)[.]/\10./
     /[.]/!s/$/./
     s/$/00000000/
     s/\([.].\{8\}\).*/\1/
' input

If the input lines are in the form of <rowNo>|<date time>|<number>, then

sed 'h
     s/.*|//
     s/^\(-\{0,1\}\)[.]/\10./
     /[.]/!s/$/./
     s/$/00000000/
     s/\([.].\{8\}\).*/\1/
     x
     s/[^|]*$//
     G
     s/\n//
' input
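Since the second script relies on the hold space, here is the same sequence of commands with a comment on each step. This is a sketch for reading along, not a different method; the function name pad8_field is hypothetical.

```shell
# Commented copy of the hold-space script above.
# "pad8_field" is a hypothetical name used only for this illustration.
pad8_field() {
    sed '
        # Save a copy of the whole line in the hold space.
        h
        # Keep only the text after the last "|", i.e. the number.
        s/.*|//
        # Insert a leading zero if the number starts with "." or "-.".
        s/^\(-\{0,1\}\)[.]/\10./
        # Append a decimal point if there is none.
        /[.]/!s/$/./
        # Append eight zeros, then cut back to eight characters after the point.
        s/$/00000000/
        s/\([.].\{8\}\).*/\1/
        # Swap: the original line comes back, the padded number is held.
        x
        # Drop the old number, keeping everything up to the last "|".
        s/[^|]*$//
        # Append the held number (after a newline) and remove that newline.
        G
        s/\n//
    ' "$@"
}

printf '%s\n' '1|2024-02-01 00:27:16|.38' '2|2024-02-01 00:27:17|126' | pad8_field
```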

Upvotes: 8

user11844224

Reputation:

-e allows multiple substitutions, as follows:
sed -e 's/^\([^.]*\)$/\1./' -e 's/^\(.*\.\)$/\100000000/' -e 's/^\(.*\..\)$/\10000000/' -e 's/^\(.*\...\)$/\1000000/' -e 's/^\(.*\....\)$/\100000/' -e 's/^\(.*\.....\)$/\10000/' -e 's/^\(.*\......\)$/\1000/' -e 's/^\(.*\.......\)$/\100/' -e 's/^\(.*\........\)$/\10/' test.csv

Upvotes: 0

chrslg

Reputation: 13491

Probably not the most subtle, but it is pure sed

sed 's/^\([^.]*\)$/\1.00000000/;s/\(\.[0-9]*\)$/\100000000/;s/\(\.........\).*/\1/;s/^\(-*\)\./\10./'

Four commands. The first one adds .00000000 if there isn't any . in the line.
The second appends 00000000, so that there may be too many decimal places but never too few (it could be simpler without checking for the ., but that check is slightly defensive: it wouldn't add the 0s if, for some reason, there were no . even after the first command).
The third keeps only 8 characters after the ., if there is one.
The fourth adds a leading 0 when the line starts with . (optionally preceded by -).
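To see what each stage of the one-liner does, the commands can be applied cumulatively to a single value. This is just an illustrative sketch: the variable names c1 to c4 and the function stages are made up here, and the commands themselves are copied from the one-liner above.

```shell
# The individual commands of the one-liner, in order.
c1='s/^\([^.]*\)$/\1.00000000/'
c2='s/\(\.[0-9]*\)$/\100000000/'
c3='s/\(\.........\).*/\1/'
c4='s/^\(-*\)\./\10./'

# Print the value after the first 1, 2, 3 and 4 commands.
# "stages" is a hypothetical helper for this illustration.
stages() {
    printf '%s\n' "$1" | sed -e "$c1"
    printf '%s\n' "$1" | sed -e "$c1" -e "$c2"
    printf '%s\n' "$1" | sed -e "$c1" -e "$c2" -e "$c3"
    printf '%s\n' "$1" | sed -e "$c1" -e "$c2" -e "$c3" -e "$c4"
}

stages .38
# .38
# .3800000000
# .38000000
# 0.38000000
```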

Upvotes: 3

Ed Morton

Reputation: 204477

Regarding:

Because this problem is part of a bigger context, it is essential to use GNU sed

For this problem alone you should not try to use sed and if it's part of a bigger context then you definitely should not use sed. sed is great for a simple s/old/new/ on individual lines but for anything else use awk instead (or perl/ruby/python/etc. if you don't mind tools that aren't mandated to exist by POSIX).

Regarding:

An alternative solution would be a way to call the shell command LC_NUMERIC=en_US.UTF-8 printf "%0.8f\n" \3 for a particular match group (here: the third match group) within the sed processing.

That would cause sed to spawn a subshell for every number which would be extremely slow and probably have other issues.
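For reference, GNU sed does have an e flag on the s command that executes the resulting pattern space as a shell command and replaces the line with its output, so something like the following sketch is possible. It is GNU-only, it starts a shell for every input line, and it pastes field contents into a shell command line, so it should only ever see trusted input. The function name pad8_exec is hypothetical.

```shell
# GNU sed only: the "e" flag runs the substituted pattern space as a shell
# command and replaces the line with that command's output.
# WARNING: the field text ends up inside a shell command; trusted input only.
# "pad8_exec" is a hypothetical name used only for this illustration.
pad8_exec() {
    sed 's/^\(.*|\)\(-\{0,1\}[0-9.]*\)$/LC_NUMERIC=C printf "%s%.8f" "\1" "\2"/e' "$@"
}

printf '%s\n' '1|2024-02-01 00:27:16|.38' '2|2024-02-01 00:27:17|-47.88' | pad8_exec
```

Here \1 is everything up to and including the last |, and \2 is the number; each line costs one extra process, which is where the slowness comes from.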

Asking for a sed solution given what you've told us so far is similar to asking us to help you shoot yourself in the foot, so rather than do that, here's how you can do what you want concisely, robustly, efficiently, maintainably and portably using any awk:

$ awk '{printf "%0.8f\n", $0}' file
0.00000000
126.00000000
99.02340000
0.38000000
-47.88000000
-234.23001101
40565004.22000000

I see elsewhere in your question you said the input is actually 3 |-separated fields (another clue that you should use awk instead of sed since awk has specific language constructs to support fields while sed does not) with the number you want to modify stored in the third field, like this:

$ cat file
1|2024-02-01 00:27:16|0
1|2024-02-01 00:27:16|126
1|2024-02-01 00:27:16|99.0234
1|2024-02-01 00:27:16|.38
1|2024-02-01 00:27:16|-47.88
1|2024-02-01 00:27:16|-234.23001101
1|2024-02-01 00:27:16|40565004.22

in which case:

$ awk -F'[|]' -v OFS='|' '{$3=sprintf("%0.8f", $3)} 1' file
1|2024-02-01 00:27:16|0.00000000
1|2024-02-01 00:27:16|126.00000000
1|2024-02-01 00:27:16|99.02340000
1|2024-02-01 00:27:16|0.38000000
1|2024-02-01 00:27:16|-47.88000000
1|2024-02-01 00:27:16|-234.23001101
1|2024-02-01 00:27:16|40565004.22000000

By the way, you said that printf "%0.8f" works and here's what that does with a number longer than 8 digits after the decimal point:

$ printf "%0.8f\n" .111111117
0.11111112

Note that the final input digits ...17 are rounded up to ...2 in the output.

If we use input like that with the above awk scripts we get the same output:

$ awk '{printf "%0.8f\n", $0}' <<< .111111117
0.11111112

You would not get the same output from any of the sed scripts posted so far; they'd just truncate to 0.11111111.
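A side-by-side sketch of the difference, using the awk command from this answer and, as a stand-in for the text-based approach, the four-command sed script from one of the other answers. The function names round8 and trunc8 are made up for the illustration.

```shell
# awk formats the value numerically, so the ninth decimal is rounded.
# "round8" and "trunc8" are hypothetical names used only here.
round8() {
    LC_ALL=C awk '{printf "%0.8f\n", $0}'
}

# The sed approach only pads and cuts text, so the ninth decimal is dropped.
trunc8() {
    sed -e '/\./!s/$/./' \
        -e '/[0-9]\./!s/\./0&/' \
        -e 's/$/00000000/' \
        -e 's/\(\.[0-9]\{8\}\).*/\1/'
}

printf '%s\n' .111111117 | round8   # 0.11111112
printf '%s\n' .111111117 | trunc8   # 0.11111111
```

For input that never has more than 8 decimals, as stated in the question, the two give the same text.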

You can see more information on rounding numbers in general at https://www.gnu.org/software/gawk/manual/gawk.html#Round-Function.

Upvotes: 10
