Michal B.

Reputation: 111

Python Dict truncates a key

I am playing around with Riemann integrals in Python. I have a couple of functions:

import numpy as np

def myfunc(x, mu, sigma):
    # Unnormalized Gaussian centred at mu with width sigma
    px = np.exp(-(x-mu)**2/(2*sigma**2))
    return px

def get_area(h,mu,sigma):
    x = np.arange(-100,100+h,h)
    return sum([myfunc(xi,mu,sigma)*h for xi in x])

I am trying to explore the impact of variations in sigma and in the step size h (with mu fixed at 1) on the area under the function. I do so in the following way:

sigma_range = [0.25,0.5,1,2]
h_range = [2,1,0.1,0.001,0.00001]

result_dict = {}

for sigma in sigma_range:
    sigma_dict = {}
    for h in h_range:
        sigma_dict[str(repr(h))] = sigma_dict.get(str(h), [])
        sigma_dict[str(repr(h))].append(get_area(h,1,sigma))
        result_dict[str(sigma)] = sigma_dict

Upon investigation, one of the h values (used as a key) gets truncated: "0.00001" turns into "1e-05".

result_dict["0.25"]

{'2': [0.0013418505116100474],
 '1': [1.0006709252558303],
 '0.1': [0.6266570686577856],
 '0.001': [0.6266570686547552],
 '1e-05': [0.6266570684587373]}

This causes a second problem: when I put the dictionary into a pandas DataFrame, the order of the keys gets mixed up too.


If the keys were at least in the correct order, I could live with it, as analysis would be simple. However, having to jump from one row to another makes the process tedious.

Before posting, I read around and saw that taking the repr() of a value sometimes helps, but it made no difference here.

I thought that increasing the column width might help, but that only works for value columns, not the index. Regardless, the problem occurs when creating the dictionary rather than the DataFrame itself.

Upvotes: 3

Views: 225

Answers (3)

damieen

Reputation: 11

The float specification "{:f}" works only as long as you don't need more than 6 decimal places:

"{:f}".format(0.00000001) will give you "0.000000"
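To make the precision limit concrete, here is a minimal sketch. The helper name plain_key is hypothetical (it is not part of the question or of any library); it goes through decimal.Decimal to expand the exponent instead of fixing the number of decimals. Treat it as one possible workaround rather than the recommended one.

```python
from decimal import Decimal

def plain_key(h):
    """Hypothetical helper: str(h) without scientific notation."""
    # str(h) gives the shortest repr (e.g. '1e-05'); Decimal keeps those
    # digits exactly and the 'f' format writes them in fixed-point form.
    return format(Decimal(str(h)), "f")

print("{:f}".format(0.00000001))    # default precision is 6 decimals
print("{:.8f}".format(0.00000001))  # explicit precision keeps the digit
print(plain_key(0.00001))           # no fixed precision needed
```

An explicit precision such as "{:.8f}" also works, but every key is then padded to that many decimals.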

Upvotes: 0

LeoE

Reputation: 2083

No need for any formatting. Just use floats as keys instead of strings, as suggested in the comments. This code:

import pandas as pd

result_dict = {}

for sigma in sigma_range:
    sigma_dict = {}
    for h in h_range:
        # h and sigma are used directly as keys, so no string conversion happens
        sigma_dict[h] = sigma_dict.get(h, [])
        sigma_dict[h].append(get_area(h,1,sigma))
    result_dict[sigma] = sigma_dict

df = pd.DataFrame(result_dict)

Produces this dataframe:

                            0.25
0.00001      [0.626657068681351]
0.00100     [0.6266570686580978]
0.10000     [0.6266570686577523]
1.00000     [1.0006709252558303]
2.00000  [0.0013418505116100474]
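A likely reason the string keys end up out of order is that strings are compared character by character, while floats are compared numerically. The snippet below is only an illustration of that difference using the h_range from the question; it does not reproduce what pandas does internally.

```python
h_range = [2, 1, 0.1, 0.001, 0.00001]

# String keys, as built in the question with str(h)
string_keys = [str(h) for h in h_range]

sorted_strings = sorted(string_keys)  # lexicographic order
sorted_floats = sorted(h_range)       # numeric order

print(sorted_strings)  # ['0.001', '0.1', '1', '1e-05', '2']
print(sorted_floats)   # [1e-05, 0.001, 0.1, 1, 2]
```

With float keys, any sorting applied to the index follows the numeric order you would expect.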

Upvotes: 0

Nicolás Ozimica

Reputation: 9738

In this case, try the format specifications you can pass to str.format.

In particular, the float specification: "{:f}"

result_dict = {}

for sigma in sigma_range:
    sigma_dict = {}
    for h in h_range:
        key = "{:f}".format(h)  # fixed-point notation, 6 decimal places
        sigma_dict[key] = sigma_dict.get(key, [])  # look up the same key that is stored
        sigma_dict[key].append(get_area(h,1,sigma))
    result_dict[str(sigma)] = sigma_dict

Then:

>>> result_dict["0.25"]
{'2.000000': [0.0013418505116100474],
 '1.000000': [1.0006709252558303],
 '0.100000': [0.6266570686577856],
 '0.001000': [0.6266570686547552],
 '0.000010': [0.6266570684587373]}
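As a side note, fixed-width keys like these also sort the same way as the numbers they represent, at least for non-negative values with the same number of integer digits. A small sketch, assuming a hypothetical helper fixed_key and a placeholder value in place of get_area:

```python
h_range = [2, 1, 0.1, 0.001, 0.00001]

def fixed_key(h, places=6):
    """Hypothetical helper: format h with a fixed number of decimals."""
    return "{:.{p}f}".format(h, p=places)

areas = {}
for h in h_range:
    # The appended value is a placeholder for get_area(h, 1, sigma)
    areas.setdefault(fixed_key(h), []).append(h)

keys = list(areas)
print(keys)          # insertion order, largest h first
print(sorted(keys))  # string sort now matches numeric order
```

Using setdefault with a single key variable avoids looking up one spelling of the key and storing under another.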

Upvotes: 2
