How do I fill a dictionary with indices in a for loop?

Question

I have a transposed DataFrame tr:

	7128	8719	14051	14636
JDUTC_0	2451957.36	2452149.36	2457243.98	2452531.89
JDUTC_1	2451957.37	2452149.36	2457243.99	2452531.90
JDUTC_2	2451957.37	2452149.36	2457244.00	2452531.91
JDUTC_3	NaN	2452149.36	NaN	NaN
JDUTC_4	NaN	2452149.36	NaN	NaN
JDUTC_5	NaN	2452149.36	NaN	NaN
JDUTC_6	1.23	2452149.37	NaN	NaN
JDUTC_7	NaN	NaN	NaN	NaN
JDUTC_8	NaN	NaN	NaN	NaN
JDUTC_9	NaN	NaN	NaN	NaN

And I create dict 'a' with this block of code:

a = {}
b=[]
for _, contents in tr.items():
    b.clear()
    for ind, val in enumerate(contents):
        if np.isnan(val):
            b.append(ind)
            continue
        else:
            pass
    print(_)
    print(b)
    a[_] = b
    print(a)

Which gives me this output:

7128
[3, 4, 5, 7, 8, 9]
{7128: [3, 4, 5, 7, 8, 9]}
8719
[7, 8, 9]
{7128: [7, 8, 9], 8719: [7, 8, 9]}
14051
[3, 4, 5, 6, 7, 8, 9]
{7128: [3, 4, 5, 6, 7, 8, 9], 8719: [3, 4, 5, 6, 7, 8, 9], 14051: [3, 4, 5, 6, 7, 8, 9]}
14636
[3, 4, 5, 6, 7, 8, 9]
{7128: [3, 4, 5, 6, 7, 8, 9], 8719: [3, 4, 5, 6, 7, 8, 9], 14051: [3, 4, 5, 6, 7, 8, 9], 
14636: [3, 4, 5, 6, 7, 8, 9]}

What I expect dict 'a' to look like is this:

{7128: [3, 4, 5, 7, 8, 9]
 8719: [7, 8, 9]
14051: [3, 4, 5, 6, 7, 8, 9]
14636: [3, 4, 5, 6, 7, 8, 9]}

What am I doing wrong? Why does a[_] = b overwrite all the previous keys when print(_) verifies that _ is always the next column label?

Gwang-Jin Kim · Accepted Answer

Using conventional variable names, I would change your code, after this setup:

import numpy as np
import pandas as pd

import sys
if sys.version_info[0] < 3:
    from StringIO import StringIO
else:
    from io import StringIO

s = StringIO("""idx 7128    8719    14051   14636
JDUTC_0 2451957.36  2452149.36  2457243.98  2452531.89
JDUTC_1 2451957.37  2452149.36  2457243.99  2452531.90
JDUTC_2 2451957.37  2452149.36  2457244.00  2452531.91
JDUTC_3 NaN 2452149.36  NaN NaN
JDUTC_4 NaN 2452149.36  NaN NaN
JDUTC_5 NaN 2452149.36  NaN NaN
JDUTC_6 1.23    2452149.37  NaN NaN
JDUTC_7 NaN NaN NaN NaN
JDUTC_8 NaN NaN NaN NaN
JDUTC_9 NaN NaN NaN NaN""")

tr = pd.read_csv(s, sep=r"\s+", index_col=0)  # columns are separated by whitespace

(A question should include minimal working code, but people often forget parts of it, e.g. the imports and the code that builds the data frame.)

to:



a = {}
b = []
for name, values in tr.items():
    b.clear() # this is problematic: every key in a ends up referring to this one list
    for ind, val in enumerate(values):
        if np.isnan(val):
            b.append(ind)
            continue
        else:
            pass
    a[name] = b

continue and pass are not necessary here: continue is the last statement in the loop body and pass does nothing, so the loop goes on either way. In Python, you are not forced to give the else branch:

for name, values in tr.items():
    b.clear() # This is still problematic at this stage.
    for ind, val in enumerate(values):
        if np.isnan(val):
            b.append(ind)
    a[name] = b
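
Why the shared list is a problem can be shown without pandas. The following is a minimal sketch, assuming a plain dict of lists (the names columns, shared and fixed are made up for illustration) in place of the data frame: a[name] = b stores a reference to the list, not a copy, so every key ends up pointing at the same object.

```python
import math

# Hypothetical stand-in data: column label -> values, no pandas required.
columns = {"x": [1.0, math.nan], "y": [math.nan, math.nan]}

shared = {}
b = []
for name, values in columns.items():
    b.clear()  # empties the one list object every earlier key already refers to
    for ind, val in enumerate(values):
        if math.isnan(val):
            b.append(ind)
    shared[name] = b  # stores a reference to b, not a copy

# Both keys refer to the very same list, so both show the last column's result.
print(shared["x"] is shared["y"])  # True
print(shared)  # {'x': [0, 1], 'y': [0, 1]}

fixed = {}
for name, values in columns.items():
    b = []  # rebinding b creates a fresh list for each column
    for ind, val in enumerate(values):
        if math.isnan(val):
            b.append(ind)
    fixed[name] = b

print(fixed)  # {'x': [1], 'y': [0, 1]}
```

Replacing b.clear() with b = [] is the smallest change that fixes the loop; storing b.copy() would work as well.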

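Collecting data like this in a for loop is better done with a list comprehension, which also builds a new list on every iteration, so the dictionary values no longer share one list: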

a = {}
for name, values in tr.items():
    b = [ind for ind, val in enumerate(values) if np.isnan(val)]
    a[name] = b
# now the result is already correct!

Finally, you can build the dictionary itself with a dictionary comprehension, which turns the entire code into a one-liner that is still readable once you are familiar with comprehensions:

a = {name: [i for i, x in enumerate(vals) if np.isnan(x)] for name, vals in tr.items()}

You can see the result:

a
# which returns:
{'7128': [3, 4, 5, 7, 8, 9],
 '8719': [7, 8, 9],
 '14051': [3, 4, 5, 6, 7, 8, 9],
 '14636': [3, 4, 5, 6, 7, 8, 9]}
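
As an alternative that is not part of the code above, the inner loop could probably be left to pandas and NumPy. This is only a sketch: it rebuilds two columns from the question's table by hand (with integer labels, as in the question) and assumes that positional indices are what is wanted.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Two columns copied from the question's table; integer labels as in the question.
tr = pd.DataFrame({
    7128: [2451957.36, 2451957.37, 2451957.37, nan, nan, nan, 1.23, nan, nan, nan],
    8719: [2452149.36] * 6 + [2452149.37, nan, nan, nan],
})

# isna() marks the missing cells; flatnonzero returns the positions of the True entries.
a = {name: np.flatnonzero(col.isna().to_numpy()).tolist() for name, col in tr.items()}
print(a)  # {7128: [3, 4, 5, 7, 8, 9], 8719: [7, 8, 9]}
```

Each column gets its own new list here as well, so the sharing problem cannot occur.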

List comprehensions lean toward functional programming (FP), which avoids mutation (such as the b.append() or b.clear() methods). As you have seen, your case demonstrates how easily mutation produces a bug. It also contributes to the discussion of why FP, although it looks brain-unfriendly at first sight, is actually the more brain-friendly way to program.

List comprehensions are the Pythonic form of "map", and an "if" inside a list comprehension is the Pythonic equivalent of "filter", which is second nature to FP people.
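
To make the correspondence concrete, here is a small sketch that writes the same NaN-index search once as a comprehension and once with the built-in filter and map. The sample values and variable names are made up for illustration.

```python
import math

values = [2451957.36, math.nan, 1.23, math.nan]  # hypothetical sample column

# Comprehension: the "if" clause filters, the leading expression maps.
by_comprehension = [ind for ind, val in enumerate(values) if math.isnan(val)]

# The same thing spelled with filter (keep the NaN pairs) and map (take the index).
nan_pairs = filter(lambda pair: math.isnan(pair[1]), enumerate(values))
by_map_filter = list(map(lambda pair: pair[0], nan_pairs))

print(by_comprehension)  # [1, 3]
print(by_map_filter)     # [1, 3]
```

Both versions build a new list without mutating an existing one; the comprehension is usually considered the more readable of the two in Python.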
