Reputation: 468
Is it possible to use sed to pad floating point numbers to a uniform precision of, e.g., 8 digits after the decimal point? Because this problem is part of a bigger context, it is essential to use GNU sed.
Input Examples:
0
126
99.0234
.38
-47.88
-234.23001101
40565004.22
The goal is to append trailing zeros to each number until the desired precision is reached. It is guaranteed that no number in the input has more digits after the decimal point than the desired precision. Another (minor) goal is to add a leading zero before the decimal point if it is missing.
This is exactly how bash's LC_NUMERIC=en_US.UTF-8 printf "%0.8f" .123 works.
Desired output:
0.00000000
126.00000000
99.02340000
0.38000000
-47.88000000
-234.23001101
40565004.22000000
All I have found so far is about limiting the number of digits, not about extending it.
An alternative solution would be a way to call the shell command
LC_NUMERIC=en_US.UTF-8 printf "%0.8f\n" \3
for a particular match group (here: the third match group) within the sed processing.
Input: <rowNo>|<date time>|<number> (example: 1|2024-02-01 00:27:16|.38, numbers as above)
command (obviously not working, for third match group processing):
sed 's/^\([0-9]*\)|\(.*\)|\([0-9\.]*\)/\1|\2|$(LC_NUMERIC=en_US.UTF-8 printf "%.8f\n" \3))/g' test.csv
Updates in reply to some comments: It is not the point to have exact floating point representations. The numbers in my input have been printed out by a program which is out of my control as (up-to) 8 digit string representations. I added tag 'bash' because I thought it might be helpful.
Upvotes: -1
Views: 221
Reputation: 30951
#!/usr/bin/sed -f
# Add trailing decimal point if absent
/\./!s/$/./
# Add leading zero if no digit before point
/[0-9]\./!s/\./0&/
# Add eight zeroes
s/$/00000000/
# Trim to exactly eight digits after point
s/\(\.[0-9]\{8\}\).*/\1/
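A minimal usage sketch for the script above, assuming it is written to a temporary file and run with sed -f. The variable name pad8_script is made up for this example, and only three of the sample inputs are shown:

```shell
# Sketch: write the script above to a temporary file and run it with sed -f.
# The variable name pad8_script is an assumption made for this example.
pad8_script=$(mktemp)
trap 'rm -f "$pad8_script"' EXIT
cat > "$pad8_script" <<'EOF'
/\./!s/$/./
/[0-9]\./!s/\./0&/
s/$/00000000/
s/\(\.[0-9]\{8\}\).*/\1/
EOF

printf '%s\n' 0 .38 -47.88 | sed -f "$pad8_script"
# prints:
# 0.00000000
# 0.38000000
# -47.88000000
```

Note that this handles bare numbers, one per line. It does not pick out the third field of a |-separated line.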
Upvotes: 1
Reputation: 10133
This sed command might be what you are looking for:
sed 's/^\(-\{0,1\}\)[.]/\10./
/[.]/!s/$/./
s/$/00000000/
s/\([.].\{8\}\).*/\1/
' input
If the input lines are in the form <rowNo>|<date time>|<number>, then:
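sed '
# Save a copy of the whole line in the hold space
h
# Reduce the pattern space to the last field (the number)
s/.*|//
# Insert a leading zero before a bare decimal point
s/^\(-\{0,1\}\)[.]/\10./
# Append a decimal point if there is none
/[.]/!s/$/./
# Pad with eight zeroes, then cut to exactly eight characters after the point
s/$/00000000/
s/\([.].\{8\}\).*/\1/
# Swap: the pattern space now holds the original line, the hold space the formatted number
x
# Strip the old number, leaving "rowNo|date time|"
s/[^|]*$//
# Append the formatted number (G inserts a newline first), then delete that newline
G
s/\n//
' input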
Upvotes: 8
Reputation:
-e allows multiple substitutions to be chained, as follows:
sed -e 's/^\([^.]*\)$/\1./' -e 's/^\(.*\.\)$/\100000000/' -e 's/^\(.*\..\)$/\10000000/' -e 's/^\(.*\...\)$/\1000000/' -e 's/^\(.*\....\)$/\100000/' -e 's/^\(.*\.....\)$/\10000/' -e 's/^\(.*\......\)$/\1000/' -e 's/^\(.*\.......\)$/\100/' -e 's/^\(.*\........\)$/\10/' test.csv
Upvotes: 0
Reputation: 13491
Probably not the most subtle, but it is pure sed:
sed 's/^\([^.]*\)$/\1.00000000/;s/\(\.[0-9]*\)$/\100000000/;s/\(\.........\).*/\1/;s/^\(-*\)\./\10./'
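Four commands. The first one adds .00000000 if the line contains no decimal point at all.
The second appends 00000000, so that we may end up with too many decimal places but never too few. It could be simpler without checking for the decimal point, but the check is slightly defensive: it won't add the zeroes if, for some reason, there is still no decimal point after the first command.
The third keeps only 8 characters after the decimal point, if there is one.
The fourth inserts a leading 0 when the number starts with a decimal point, optionally preceded by a minus sign.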
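To make the order of operations easier to follow, here is a sketch of the four substitutions from the command above, split into one -e per step, with the intermediate values for the input .38 traced in comments. The function name pad8 is an assumption made for this example:

```shell
# The four substitutions from the one-liner above, one -e per step.
# Trace for the input ".38" is given after each step.
pad8() {
  sed -e 's/^\([^.]*\)$/\1.00000000/' \
      -e 's/\(\.[0-9]*\)$/\100000000/' \
      -e 's/\(\.........\).*/\1/' \
      -e 's/^\(-*\)\./\10./'
}
# step 1: .38          (unchanged, the line already contains a point)
# step 2: .3800000000  (eight zeroes appended)
# step 3: .38000000    (point plus the next eight characters kept)
# step 4: 0.38000000   (leading zero inserted)

printf '%s\n' 0 126 99.0234 .38 -47.88 -234.23001101 40565004.22 | pad8
```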
Upvotes: 3
Reputation: 204477
Regarding:
Because this problem is part of a bigger context, it is essential to use GNU sed
For this problem alone you should not try to use sed and if it's part of a bigger context then you definitely should not use sed. sed is great for a simple s/old/new/ on individual lines but for anything else use awk instead (or perl/ruby/python/etc. if you don't mind tools that aren't mandated to exist by POSIX).
Regarding:
An alternative solution would be a way to call the shell command
LC_NUMERIC=en_US.UTF-8 printf "%0.8f\n" \3
for a particular match group (here: the third match group) within the sed processing.
That would cause sed to spawn a subshell for every number which would be extremely slow and probably have other issues.
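For reference, GNU sed does offer a mechanism of this kind: the e flag of the s command executes the resulting pattern space as a shell command and replaces the pattern space with its output. The following is only a sketch under stated assumptions (GNU sed, a /bin/sh whose printf supports %f, and trusted input in exactly the <rowNo>|<date time>|<number> shape); the function name fmt_third_field is made up for this example. It starts one shell per matching line, so the performance concern above still applies, and because field contents end up on a shell command line, the pattern deliberately accepts only digits, spaces, colons, hyphens and a decimal point.

```shell
# GNU sed only: the "e" flag runs the substituted pattern space as a shell command
# and replaces the pattern space with that command's output.
# One /bin/sh process is started for every matching line.
# fmt_third_field is a hypothetical name used for this sketch.
fmt_third_field() {
  LC_ALL=C sed -E 's/^([0-9]+)[|]([0-9: -]+)[|](-?[0-9]*\.?[0-9]+)$/printf "%s|%s|%.8f" "\1" "\2" "\3"/e'
}

printf '%s\n' '1|2024-02-01 00:27:16|.38' | fmt_third_field
```

Lines that do not match the pattern are passed through unchanged.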
Asking for a sed solution given what you've told us so far is similar to asking us to help you shoot yourself in the foot, so rather than do that, here's how you can do what you want concisely, robustly, efficiently, maintainably and portably using any awk:
$ awk '{printf "%0.8f\n", $0}' file
0.00000000
126.00000000
99.02340000
0.38000000
-47.88000000
-234.23001101
40565004.22000000
I see elsewhere in your question you said the input is actually 3 |-separated fields (another clue that you should use awk instead of sed, since awk has specific language constructs to support fields while sed does not) with the number you want to modify stored in the third field, like this:
$ cat file
1|2024-02-01 00:27:16|0
1|2024-02-01 00:27:16|126
1|2024-02-01 00:27:16|99.0234
1|2024-02-01 00:27:16|.38
1|2024-02-01 00:27:16|-47.88
1|2024-02-01 00:27:16|-234.23001101
1|2024-02-01 00:27:16|40565004.22
in which case:
$ awk -F'[|]' -v OFS='|' '{$3=sprintf("%0.8f", $3)} 1' file
1|2024-02-01 00:27:16|0.00000000
1|2024-02-01 00:27:16|126.00000000
1|2024-02-01 00:27:16|99.02340000
1|2024-02-01 00:27:16|0.38000000
1|2024-02-01 00:27:16|-47.88000000
1|2024-02-01 00:27:16|-234.23001101
1|2024-02-01 00:27:16|40565004.22000000
By the way, you said that printf "%0.8f" works, and here's what that does with a number that has more than 8 digits after the decimal point:
$ printf "%0.8f\n" .111111117
0.11111112
Note that the final input digits ...17 are rounded up to ...2 in the output.
If we use input like that with the above awk scripts we get the same output:
$ awk '{printf "%0.8f\n", $0}' <<< .111111117
0.11111112
You would not get the same output from any of the sed scripts posted so far; they'd just truncate to 0.11111111.
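The difference can be checked side by side. This sketch reuses one of the pad-and-trim sed commands posted in this thread and the awk command from above; the function names truncate8 and round8 are assumptions made for the example:

```shell
# Truncation (sed-style padding and trimming) versus rounding (printf-style
# formatting) on an input with nine digits after the decimal point.
truncate8() {
  sed -e 's/^\(-\{0,1\}\)[.]/\10./' \
      -e '/[.]/!s/$/./' \
      -e 's/$/00000000/' \
      -e 's/\([.].\{8\}\).*/\1/'
}
round8() {
  awk '{printf "%0.8f\n", $0}'
}

printf '%s\n' .111111117 | truncate8   # prints 0.11111111
printf '%s\n' .111111117 | round8      # prints 0.11111112
```

For input that never has more than 8 digits after the decimal point, as stated in the question, both produce the same result.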
You can see more information on rounding numbers in general at https://www.gnu.org/software/gawk/manual/gawk.html#Round-Function.
Upvotes: 10