dmon

Reputation: 1422

Rounding to significant figures in numpy

I've tried searching this and can't find a satisfactory answer.

I want to take a list/array of numbers and round them all to n significant figures. I have written a function to do this, but I was wondering if there is a standard method for this? I've searched but can't find it. Example:

In:  [  0.0, -1.2366e22, 1.2544444e-15, 0.001222 ], n=2
Out: [ 0.00, -1.24e22,        1.25e-15,  1.22e-3 ]

Thanks

Upvotes: 41

Views: 60536

Answers (13)

Sean Lake

Reputation: 608

First a criticism: you're counting the number of significant figures wrong. In your example you want n=3, not 2.

Most of the edge cases can be left to numpy library functions if you build on frexp, the function that makes the binary version of this algorithm simple. As a bonus, this algorithm will also run much faster because it never calls the log function.

#The following constant was computed in maxima 5.35.1 using 64 bigfloat digits of precision
__logBase10of2 = 3.010299956639811952137388947244930267681898814621085413104274611e-1

import numpy as np

def RoundToSigFigs_fp( x, sigfigs ):
    """
    Rounds the value(s) in x to the number of significant figures in sigfigs.
    Return value has the same type as x.

    Restrictions:
    sigfigs must be an integer type and store a positive value.
    x must be a real value or an array like object containing only real values.
    """
    # Python 3 has no separate long type, so int and numpy integers cover all cases
    if not ( type(sigfigs) is int or isinstance(sigfigs, np.integer) ):
        raise TypeError( "RoundToSigFigs_fp: sigfigs must be an integer." )

    if sigfigs <= 0:
        raise ValueError( "RoundToSigFigs_fp: sigfigs must be positive." )

    if not np.all(np.isreal( x )):
        raise TypeError( "RoundToSigFigs_fp: all x must be real." )

    #temporarily suppress floating point errors
    errhanddict = np.geterr()
    np.seterr(all="ignore")

    matrixflag = False
    if isinstance(x, np.matrix): #Convert matrices to arrays
        matrixflag = True
        x = np.asarray(x)

    xsgn = np.sign(x)
    absx = xsgn * x
    mantissas, binaryExponents = np.frexp( absx )

    decimalExponents = __logBase10of2 * binaryExponents
    omags = np.floor(decimalExponents)

    mantissas *= 10.0**(decimalExponents - omags)

    #shift any mantissa below 1 back into the range [1, 10)
    if type(mantissas) is float or isinstance(mantissas, np.floating):
        if mantissas < 1.0:
            mantissas *= 10.0
            omags -= 1.0

    else: #elif np.all(np.isreal( mantissas )):
        fixmsk = mantissas < 1.0
        mantissas[fixmsk] *= 10.0
        omags[fixmsk] -= 1.0

    result = xsgn * np.around( mantissas, decimals=sigfigs - 1 ) * 10.0**omags
    if matrixflag:
        result = np.matrix(result, copy=False)

    np.seterr(**errhanddict)
    return result

And it handles all of your cases correctly, including infinite, nan, 0.0, and a subnormal number:

>>> eglist = [  0.0, -1.2366e22, 1.2544444e-15, 0.001222, 0.0, 
...        float("nan"), float("inf"), float.fromhex("0x4.23p-1028"), 
...        0.5555, 1.5444, 1.72340, 1.256e-15, 10.555555  ]
>>> eglist
[0.0, -1.2366e+22, 1.2544444e-15, 0.001222, 0.0, 
nan, inf, 1.438203867284623e-309, 
0.5555, 1.5444, 1.7234, 1.256e-15, 10.555555]
>>> RoundToSigFigs_fp(eglist, 3)
array([  0.00000000e+000,  -1.24000000e+022,   1.25000000e-015,
         1.22000000e-003,   0.00000000e+000,               nan,
                     inf,   1.44000000e-309,   5.56000000e-001,
         1.54000000e+000,   1.72000000e+000,   1.26000000e-015,
         1.06000000e+001])
>>> RoundToSigFigs_fp(eglist, 1)
array([  0.00000000e+000,  -1.00000000e+022,   1.00000000e-015,
         1.00000000e-003,   0.00000000e+000,               nan,
                     inf,   1.00000000e-309,   6.00000000e-001,
         2.00000000e+000,   2.00000000e+000,   1.00000000e-015,
         1.00000000e+001])
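
As a rough illustration of why frexp makes the log call unnecessary, the sketch below converts the binary exponent returned by np.frexp into a decimal order of magnitude for a single positive, finite scalar. The helper name decimal_parts and the constant LOG10_OF_2 are made up for this example and are not part of the code above.

```python
import math
import numpy as np

LOG10_OF_2 = math.log10(2.0)  # double-precision version of the constant above

def decimal_parts(x):
    """Split a positive finite float into (mantissa, exponent), 1 <= mantissa < 10."""
    # frexp gives x == m2 * 2**e2 with 0.5 <= m2 < 1
    m2, e2 = np.frexp(x)
    dec_exp = LOG10_OF_2 * e2             # 2**e2 == 10**dec_exp
    omag = math.floor(dec_exp)            # integer part becomes the decimal exponent
    m10 = m2 * 10.0 ** (dec_exp - omag)   # fractional part is folded into the mantissa
    if m10 < 1.0:                         # mantissa landed in [0.5, 1): shift one decade
        m10 *= 10.0
        omag -= 1
    return m10, omag

mantissa, exponent = decimal_parts(0.001222)
print(mantissa, exponent)
```

Rounding the returned mantissa to sigfigs - 1 decimals and multiplying by 10.0**exponent is then the same final step as in the function above.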

Edit: 2016/10/12 I found an edge case that the original code handled wrong. I have placed a fuller version of the code in a GitHub repository.

Edit: 2019/03/01 Replace with recoded version.

Edit: 2020/11/19 Replace with vectorized version from Github that handles arrays. Note that preserving input data types, where possible, was also a goal of this code.

Upvotes: 13

timmey

Reputation: 125

One more solution which works well. Doing the test from @ScottGigante, it would be second best with a timing of 1.75ms.

import math
import numpy as np

def sig_dig(x, n_sig_dig = 5):
  num_of_digits = len(str(x).replace(".", ""))
  if n_sig_dig >= num_of_digits:
      return x
  n = math.floor(math.log10(abs(x)) + 1 - n_sig_dig)
  result = round(x * 10**(-n)) * 10**n
  return result

And if it should be applied also to list/arrays you can vectorize it as

sig_dig_vec = np.vectorize(sig_dig)

Credit: answer inspired by this post

Upvotes: 0

Dom McEwen

Reputation: 421

Here is a version of Autumn's answer which is vectorized, so it can be applied to an array of floats and not just a single float.

x = np.array([12345.6, 12.5673])
def sf4(x):
    x = float(np.format_float_positional(x, precision=4, unique=False, fractional=False,trim='k'))
    return x
vec_sf4 = np.vectorize(sf4)

vec_sf4(x)
# returns np.array([12350., 12.57])

Upvotes: 2

amzon-ex

Reputation: 1744

For (display-) formatting in exponential notation, numpy.format_float_scientific(x, precision = n) (where x is the number to be formatted) seems to work well. The method returns a string. (This is similar to @Autumn's answer)

Here is an example:

>>> x = 7.92398e+05
>>> print(numpy.format_float_scientific(x, precision = 3))
7.924e+05

Here, the argument precision = n fixes the number of decimals in the mantissa (by rounding off). The string can be converted back to float type, which would keep only the digits present in the string. The result would be in positional float format, though, and the extra conversion work probably makes this a bad idea for a large set of numbers.

Also, this doesn't work with iterables; see the docs for more info.
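
A minimal sketch of the string round trip described above, assuming p counts significant figures (so the mantissa gets p - 1 decimals). The helper name round_via_sci is hypothetical, and the list comprehension is just one way to work around the lack of iterable support.

```python
import numpy as np

def round_via_sci(x, p):
    """Round scalar x to p significant figures by formatting and re-parsing."""
    # precision counts digits after the decimal point of the mantissa, hence p - 1
    return float(np.format_float_scientific(x, precision=p - 1))

values = [7.92398e+05, 1.2544444e-15]
# format_float_scientific takes one scalar at a time, so loop explicitly
rounded = [round_via_sci(v, 4) for v in values]
print(rounded)
```

Each element goes through a string, so this is only reasonable for small inputs or for display purposes.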

Upvotes: 0

A. West

Reputation: 684

For Scalars

sround = lambda x,p: float(f'%.{p-1}e'%x)

Example

>>> print( sround(123.45, 2) )
120.0
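
The lambda above mixes an f-string (to build the format '%.1e') with %-formatting (to apply it). A sketch of the same idea using a single f-string with a nested precision field, written as a named function for readability:

```python
def sround(x, p):
    """Round scalar x to p significant figures via exponential formatting."""
    # '.{p - 1}e' keeps one digit before the decimal point and p - 1 after it
    return float(f"{x:.{p - 1}e}")

print(sround(123.45, 2))     # 120.0
print(sround(-0.001222, 3))  # -0.00122
```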

For Arrays

Use Scott Gigante's signif(x, p).

Upvotes: 0

Scott Gigante

Reputation: 1644

Testing all of the already proposed solutions, I find they either

  1. convert to and from strings, which is inefficient
  2. can't handle negative numbers
  3. can't handle arrays
  4. have some numerical errors.

Here's my attempt at a solution which should handle all of these things. (Edit 2020-03-18: added np.asarray as suggested by A. West.)

def signif(x, p):
    x = np.asarray(x)
    x_positive = np.where(np.isfinite(x) & (x != 0), np.abs(x), 10**(p-1))
    mags = 10 ** (p - 1 - np.floor(np.log10(x_positive)))
    return np.round(x * mags) / mags
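
To make the arithmetic concrete, here is the same computation unrolled for one value. The variable names exponent and scaled are only for illustration; the np.where guard for zero and non-finite inputs is left out because this x is an ordinary nonzero number.

```python
import numpy as np

x, p = 0.001222, 3

exponent = np.floor(np.log10(abs(x)))  # -3.0: position of the leading digit
mags = 10 ** (p - 1 - exponent)        # 10**5: scale so p digits sit left of the point
scaled = np.round(x * mags)            # 122.2 rounds to 122.0
result = scaled / mags                 # approximately 0.00122

print(exponent, mags, scaled, result)
```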

Testing:

def scottgigante(x, p):
    x_positive = np.where(np.isfinite(x) & (x != 0), np.abs(x), 10**(p-1))
    mags = 10 ** (p - 1 - np.floor(np.log10(x_positive)))
    return np.round(x * mags) / mags

def awest(x,p):
    return float(f'%.{p-1}e'%x)

def denizb(x,p):
    return float(('%.' + str(p-1) + 'e') % x)

def autumn(x, p):
    return np.format_float_positional(x, precision=p, unique=False, fractional=False, trim='k')

def greg(x, p):
    return round(x, -int(np.floor(np.sign(x) * np.log10(abs(x)))) + p-1)

def user11336338(x, p):         
    xr = (np.floor(np.log10(np.abs(x)))).astype(int)
    xr=10.**xr*np.around(x/10.**xr,p-1)   
    return xr

def dmon(x, p):
    if np.all(np.isfinite(x)):
        eset = np.seterr(all='ignore')
        mags = 10.0**np.floor(np.log10(np.abs(x)))  # omag's
        x = np.around(x/mags,p-1)*mags             # round(val/omag)*omag
        np.seterr(**eset)
        x = np.where(np.isnan(x), 0.0, x)           # 0.0 -> nan -> 0.0
    return x

def seanlake(x, p):
    __logBase10of2 = 3.010299956639811952137388947244930267681898814621085413104274611e-1
    xsgn = np.sign(x)
    absx = xsgn * x
    mantissa, binaryExponent = np.frexp( absx )

    decimalExponent = __logBase10of2 * binaryExponent
    omag = np.floor(decimalExponent)

    mantissa *= 10.0**(decimalExponent - omag)

    if mantissa < 1.0:
        mantissa *= 10.0
        omag -= 1.0

    return xsgn * np.around( mantissa, decimals=p - 1 ) * 10.0**omag

solns = [scottgigante, awest, denizb, autumn, greg, user11336338, dmon, seanlake]

xs = [
    1.114, # positive, round down
    1.115, # positive, round up
    -1.114, # negative
    1.114e-30, # extremely small
    1.114e30, # extremely large
    0, # zero
    float('inf'), # infinite
    [1.114, 1.115e-30], # array input
]
p = 3

print('input:', xs)
for soln in solns:
    print(f'{soln.__name__}', end=': ')
    for x in xs:
        try:
            print(soln(x, p), end=', ')
        except Exception as e:
            print(type(e).__name__, end=', ')
    print()

Results:

input: [1.114, 1.115, -1.114, 1.114e-30, 1.114e+30, 0, inf, [1.114, 1.115e-30]]
scottgigante: 1.11, 1.12, -1.11, 1.11e-30, 1.11e+30, 0.0, inf, [1.11e+00 1.12e-30], 
awest: 1.11, 1.11, -1.11, 1.11e-30, 1.11e+30, 0.0, inf, TypeError, 
denizb: 1.11, 1.11, -1.11, 1.11e-30, 1.11e+30, 0.0, inf, TypeError, 
autumn: 1.11, 1.11, -1.11, 0.00000000000000000000000000000111, 1110000000000000000000000000000., 0.00, inf, TypeError, 
greg: 1.11, 1.11, -1.114, 1.11e-30, 1.11e+30, ValueError, OverflowError, TypeError, 
user11336338: 1.11, 1.12, -1.11, 1.1100000000000002e-30, 1.1100000000000001e+30, nan, nan, [1.11e+00 1.12e-30], 
dmon: 1.11, 1.12, -1.11, 1.1100000000000002e-30, 1.1100000000000001e+30, 0.0, inf, [1.11e+00 1.12e-30], 
seanlake: 1.11, 1.12, -1.11, 1.1100000000000002e-30, 1.1100000000000001e+30, 0.0, inf, ValueError, 

Timing:

def test_soln(soln):
    try:
        soln(np.linspace(1, 100, 1000), 3)
    except Exception:
        [soln(x, 3) for x in np.linspace(1, 100, 1000)]

for soln in solns:
    print(soln.__name__)
    %timeit test_soln(soln)

Results:

scottgigante
135 µs ± 15.3 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
awest
2.23 ms ± 430 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
denizb
2.18 ms ± 352 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
autumn
2.92 ms ± 206 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
greg
14.1 ms ± 1.21 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
user11336338
157 µs ± 50.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
dmon
142 µs ± 8.52 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
seanlake
20.7 ms ± 994 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
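
The %timeit magic only exists inside IPython/Jupyter. Outside of it, a rough equivalent can be put together with the standard timeit module, as sketched here for signif only. The call count n_calls is an arbitrary choice, and the numbers will differ from those above depending on the machine.

```python
import timeit
import numpy as np

def signif(x, p):
    x = np.asarray(x)
    x_positive = np.where(np.isfinite(x) & (x != 0), np.abs(x), 10**(p-1))
    mags = 10 ** (p - 1 - np.floor(np.log10(x_positive)))
    return np.round(x * mags) / mags

xs = np.linspace(1, 100, 1000)
n_calls = 200  # arbitrary; raise it for more stable numbers

# total wall time for n_calls invocations, in seconds
elapsed = timeit.timeit(lambda: signif(xs, 3), number=n_calls)
print(f"signif: {elapsed / n_calls * 1e6:.0f} µs per call")
```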

Upvotes: 25

user11336338

Reputation: 1

I like Greg's very short effective routine above. However, it suffers from two drawbacks. One is that it doesn't work for x<0, not for me anyway. (That np.sign(x) should be removed.) Another is that it does not work if x is an array. I've fixed both of those problems with the routine below. Notice that I've changed the definition of n.

import numpy as np

def Round_n_sig_dig(x, n):

    xr = (np.floor(np.log10(np.abs(x)))).astype(int)
    xr=10.**xr*np.around(x/10.**xr,n-1)   
    return xr    

Upvotes: 0

dmon

Reputation: 1422

Okay, so it's reasonably safe to say this is not provided by standard functionality. To close this off then, this is my attempt at a robust solution. It's rather ugly/non-pythonic and probably illustrates better than anything why I asked this question, so please feel free to correct or beat :)

import numpy as np

def round2SignifFigs(vals,n):
    """
    (list, int) -> numpy array
    (numpy array, int) -> numpy array

    In: a list/array of values
    Out: array of values rounded to n significant figures

    Does not accept: inf, nan, complex

    >>> m = [0.0, -1.2366e22, 1.2544444e-15, 0.001222]
    >>> round2SignifFigs(m,2)
    array([  0.00e+00,  -1.24e+22,   1.25e-15,   1.22e-03])
    """

    if np.all(np.isfinite(vals)) and np.all(np.isreal((vals))):
        eset = np.seterr(all='ignore')
        mags = 10.0**np.floor(np.log10(np.abs(vals)))  # omag's
        vals = np.around(vals/mags,n)*mags             # round(val/omag)*omag
        np.seterr(**eset)
        vals[np.where(np.isnan(vals))] = 0.0           # 0.0 -> nan -> 0.0
    else:
        raise IOError('Input must be real and finite')
    return vals

The neatest version I can get does not account for 0.0, nan, inf or complex:

>>> omag      = lambda x: 10**np.floor(np.log10(np.abs(x)))
>>> signifFig = lambda x, n: (np.around(x/omag(x),n)*omag(x))

giving:

>>> m = [0.0, -1.2366e22, 1.2544444e-15, 0.001222]
>>> signifFig(m,2)
array([ nan, -1.24e+22,   1.25e-15,   1.22e-03])
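
One possible tidy-up of the longer function is to let np.errstate save and restore the floating point error settings instead of calling np.seterr twice. This is only a sketch: the name round_sig_errstate is made up, n keeps the same meaning as above (digits after the leading one), and the input validation is omitted, so nan inputs are silently turned into 0.0.

```python
import numpy as np

def round_sig_errstate(vals, n):
    """Round to n digits after the leading digit, mapping 0.0 to 0.0."""
    vals = np.asarray(vals, dtype=float)
    # errstate restores the previous error settings when the block exits
    with np.errstate(all='ignore'):
        mags = 10.0 ** np.floor(np.log10(np.abs(vals)))  # order of magnitude
        out = np.around(vals / mags, n) * mags
    # 0.0 goes through 0/0 -> nan, so map nan back to 0.0
    return np.where(np.isnan(out), 0.0, out)

m = [0.0, -1.2366e22, 1.2544444e-15, 0.001222]
print(round_sig_errstate(m, 2))
```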

Upvotes: 2

Aditya Sinha

Reputation: 1

I got quite frustrated after scouring the internet and not finding an answer for this, so I wrote my own piece of code. Hope this is what you're looking for

import numpy as np
from numpy import ma

exp = np.floor(ma.log10(abs(X)).filled(0))
ans = np.round(X*10**-exp, sigfigs-1) * 10**exp

Just plug in your np array X and the required number of significant figures. Cheers!
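
For reference, a runnable version of the two lines with made-up sample data. The values in X and the choice sigfigs = 3 are assumptions for the example. ma.log10 masks the entries where the logarithm is undefined (here the 0.0), and filled(0) gives those entries an exponent of 0.

```python
import numpy as np
from numpy import ma

X = np.array([0.0, 123.456, -0.00123456])  # sample data, not from the question
sigfigs = 3

# masked log10 avoids -inf for zero entries; they get exponent 0 instead
exp = np.floor(ma.log10(abs(X)).filled(0))
ans = np.round(X * 10**-exp, sigfigs - 1) * 10**exp

print(ans)
```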

Upvotes: 0

Autumn

Reputation: 3766

Most of the solutions given here either (a) don't give correct significant figures, or (b) are unnecessarily complex.

If your goal is display formatting, then numpy.format_float_positional supports the desired behaviour directly. The following fragment returns the float x formatted to 4 significant figures, with scientific notation suppressed.

import numpy as np
x=12345.6
np.format_float_positional(x, precision=4, unique=False, fractional=False, trim='k')
> 12350.

Upvotes: 15

denizb

Reputation: 184

There is a simple solution that uses the logic built into Python's string formatting system:

def round_sig(f, p):
    return float(('%.' + str(p) + 'e') % f)

Test with the following example:

for f in [0.01, 0.1, 1, 10, 100, 1000, 1000]:
    f *= 1.23456789
    print('%e --> %f' % (f, round_sig(f,3)))

which yields:

1.234568e-02 --> 0.012350
1.234568e-01 --> 0.123500
1.234568e+00 --> 1.235000
1.234568e+01 --> 12.350000
1.234568e+02 --> 123.500000
1.234568e+03 --> 1235.000000
1.234568e+03 --> 1235.000000

Best of luck!

(If you like lambdas use:

round_sig = lambda f,p: float(('%.' + str(p) + 'e') % f)

)
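
Note that in '%.{p}e' the value p counts digits after the decimal point, so round_sig(f, 3) keeps four significant figures, as the output above shows. If p should count significant figures instead, a variant could pass p - 1. The name round_sig_figs below is hypothetical, chosen to avoid clashing with round_sig.

```python
def round_sig_figs(f, p):
    """Round scalar f to p significant figures (one leading digit plus p - 1 decimals)."""
    # '%.*e' takes the precision from the argument tuple
    return float('%.*e' % (p - 1, f))

x = 1.23456789
print(float('%.3e' % x))     # 1.235 (three decimals, four significant figures)
print(round_sig_figs(x, 3))  # 1.23  (three significant figures)
```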

Upvotes: 0

Greg

Reputation: 12234

From the example numbers you have I think you mean significant figures rather than decimal places (-1.2366e22 to 0 decimal places is still -1.2366e22).

This piece of code works for me, though I've always thought there should be an inbuilt function:

def Round_To_n(x, n):
    return round(x, -int(np.floor(np.sign(x) * np.log10(abs(x)))) + n)

>>> Round_To_n(1.2544444e-15,2)
1.25e-15

>>> Round_To_n(2.128282321e3, 2)
2130.0
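
Another answer in this thread reports that the routine misbehaves for x < 0 and suggests dropping np.sign(x). A sketch of that change for scalars follows; it uses the math module instead of numpy, which is a substitution made here, and the name round_to_n_abs is made up. Like the original, it does not handle 0, inf or nan.

```python
import math

def round_to_n_abs(x, n):
    """Keep n digits after the leading digit of a nonzero finite scalar."""
    # take the order of magnitude from |x| so negative inputs behave like positive ones
    exponent = math.floor(math.log10(abs(x)))
    return round(x, n - exponent)

print(round_to_n_abs(1.2544444e-15, 2))
print(round_to_n_abs(-1.114, 2))
```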

Upvotes: 3

Andrew Walker

Reputation: 42500

Is numpy.set_printoptions what you're looking for?

import numpy as np
np.set_printoptions(precision=2)
print(np.array([  0.0, -1.2366e22, 1.2544444e-15, 0.001222 ]))

Gives:

[  0.00e+00  -1.24e+22   1.25e-15   1.22e-03]

Edit:

numpy.around appears to solve aspects of this problem if you're trying to transform the data. However, it doesn't do what you want in cases where the exponent is negative.
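
The difference between the two approaches can be seen in a short sketch. np.array2string is used here instead of set_printoptions so that no global print state is changed; that substitution is a choice made for this example.

```python
import numpy as np

a = np.array([0.0, -1.2366e22, 1.2544444e-15, 0.001222])

# Display only: the text is rounded, the stored values are untouched
text = np.array2string(a, precision=2)
print(text)
print(a[3])

# np.around counts decimal places, so values with a negative exponent collapse to 0.0
rounded = np.around(a, 2)
print(rounded)
```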

Upvotes: 10
