Mark_Anderson

Reputation: 1324

Insert and delete from dictionary in a for loop -- optimal method?

I have a dict that looks like this:

{attribute_1 : True,
 attribute_2 : False,
 attribute_3 : 'foo', # Can be one of multiple text options here
 attribute_4 : 5,}    # Can be one of multiple numerical options here

I need to convert it so that every value is a boolean, thus giving:

{attribute_1 : True,
 attribute_2 : False,
 attribute_3_foo : True,
 attribute_4_5 : True}

(This is one-hot encoding for machine learning, in case anyone wonders why I'm doing such an odd thing. I will be processing a very large number of such dictionaries.)

A working solution I have found is to loop over the dict looking for non-boolean values and (1) create a new entry for each, then (2) delete the original entry that held the non-boolean value. That works, but it seems inelegant and memory-inefficient, since the list I iterate over is a new object in memory. Is there a better way to do this?

# List loop to insert ('k,v in dict' won't let you add/delete items)
for x in list(sub_d.items()):
    if type(x[1]) is not bool:
        sub_d[x[0]+'_'+ str(x[1])] = True
        del sub_d[x[0]]

PS. List comprehensions don't seem to work, as I can't find a way to fit a sufficiently complex operation into one. Besides, I don't think they would be any more efficient than my current solution?

Upvotes: 1

Views: 72

Answers (2)

user2390182

Reputation: 73480

You can use a dict comprehension:

d = {k if isinstance(v, bool) else '{}_{}'.format(k, v): bool(v) 
     for k, v in d.items()} 

{'attribute_1': True,
 'attribute_2': False,
 'attribute_3_foo': True,
 'attribute_4_5': True}
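
One edge case worth checking: `bool(v)` maps falsy non-boolean values such as `0` or `''` to `False`. If every non-boolean value should end up as `True`, as the expected output in the question suggests, a variant along these lines might be closer to what is wanted. This is only a sketch; the function name `one_hot` and the `sample` data are made up for illustration.

```python
def one_hot(d):
    """Return a new dict in which every value is a boolean."""
    return {
        # Booleans keep their key and value; anything else moves into the key.
        (k if isinstance(v, bool) else f"{k}_{v}"): (v if isinstance(v, bool) else True)
        for k, v in d.items()
    }


sample = {"attribute_1": True, "attribute_2": False,
          "attribute_3": "foo", "attribute_4": 0}
print(one_hot(sample))
# {'attribute_1': True, 'attribute_2': False, 'attribute_3_foo': True, 'attribute_4_0': True}
```

Like any comprehension, this builds a new dict rather than modifying the original in place.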

Upvotes: 1

T.Woody

Reputation: 1228

# List loop to insert ('k,v in dict' won't let you add/delete items)
for x in list(sub_d.items()):
    if type(x[1]) is not bool:
        sub_d[x[0]+'_'+ str(x[1])] = True
        del sub_d[x[0]]

Why not just:

for k, v in dic.items():
  if type(v) is not bool:  # test the value, not the key
    dic[k] = True

There is no reason to delete the entries, and this will run in O(n) time, since a dict is a hash table and each assignment to an existing key is O(1) on average.
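
Before relying on an in-place overwrite, it may be worth checking whether the original values matter downstream. The sketch below compares overwriting values with renaming keys; the function names `overwrite_values` and `rename_keys` are hypothetical. Under the assumption that the goal is one-hot encoding, two dicts that differ only in a text value become indistinguishable after an overwrite, while renaming keeps them apart.

```python
def overwrite_values(d):
    """Set every non-boolean value to True in place, keeping the original keys."""
    for k, v in d.items():
        if not isinstance(v, bool):
            d[k] = True  # existing key, so the dict size does not change
    return d


def rename_keys(d):
    """Move each non-boolean value into its key, in place."""
    # Snapshot only the non-boolean entries, since keys are added and removed below.
    pending = [(k, v) for k, v in d.items() if not isinstance(v, bool)]
    for k, v in pending:
        del d[k]
        d[f"{k}_{v}"] = True
    return d


a = {"attribute_3": "foo"}
b = {"attribute_3": "bar"}
print(overwrite_values(dict(a)) == overwrite_values(dict(b)))  # True
print(rename_keys(dict(a)) == rename_keys(dict(b)))            # False
```

Both functions are O(n); `rename_keys` copies only the non-boolean entries rather than the whole item list.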

Upvotes: 0
