Reputation: 1324
I have a dict that looks like this:
{attribute_1 : True,
attribute_2 : False,
attribute_3 : 'foo', # Can be one of multiple text options here
attribute_4 : 5,} # Can be one of multiple numerical options here
I need to convert it so that every value is a boolean, thus giving:
{attribute_1 : True,
attribute_2 : False,
attribute_3_foo : True,
attribute_4_5 : True}
(one-hot encoding for machine learning, in case anyone cares why I'm doing such an odd thing. Will process many, many such dictionaries...).
A working solution I have found is to loop through the dict hunting for non-boolean values and (1) create new entries, then (2) delete every entry whose value is not a boolean. That's fine, but it seems inelegant and memory-inefficient, since the list I iterate over is a new object in memory. Is there a better way to do this?
# List loop to insert ('k,v in dict' won't let you add/delete items)
for x in list(sub_d.items()):
    if type(x[1]) is not bool:
        sub_d[x[0]+'_'+ str(x[1])] = True
        del sub_d[x[0]]
PS. List comprehensions don't work, as I can't find a way to feed in a sufficiently complex operation to do the work. Plus, I don't think they would be any more efficient than my current solution?
Upvotes: 1
Views: 72
Reputation: 73480
You can use a dict comprehension:
d = {k if isinstance(v, bool) else '{}_{}'.format(k, v): bool(v)
     for k, v in d.items()}
{'attribute_1': True,
'attribute_2': False,
'attribute_3_foo': True,
'attribute_4_5': True}
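One edge case worth checking: bool(v) is False for falsy non-boolean values such as 0 or '', so an entry like attribute_4 : 0 would come out as attribute_4_0 : False rather than True. A minimal sketch of a variant that always maps non-boolean values to True, assuming that is the intended one-hot behaviour (the function name one_hot and the sample record are hypothetical, not from the question):

```python
def one_hot(d):
    """Return a new dict in which every value is a bool.

    Boolean entries are copied unchanged; any other entry becomes
    '<key>_<value>': True, even when the value itself is falsy (0, '').
    """
    return {
        (k if isinstance(v, bool) else '{}_{}'.format(k, v)):
        (v if isinstance(v, bool) else True)
        for k, v in d.items()
    }


# Hypothetical sample record; attribute_4 is deliberately falsy.
record = {
    'attribute_1': True,
    'attribute_2': False,
    'attribute_3': 'foo',
    'attribute_4': 0,
}
encoded = one_hot(record)
```

Because this builds a new dict instead of mutating the original, it can be applied to many records with an ordinary list comprehension, e.g. [one_hot(r) for r in records].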
Upvotes: 1
Reputation: 1228
# List loop to insert ('k,v in dict' won't let you add/delete items)
for x in list(sub_d.items()):
    if type(x[1]) is not bool:
        sub_d[x[0]+'_'+ str(x[1])] = True
        del sub_d[x[0]]
Why not just:
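for k, v in dic.items():
    # Check the value, not the key; overwriting an existing key
    # during iteration is safe because the dict's size does not change.
    if type(v) is not bool:
        dic[k] = True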
There is no reason to delete the entries, and this will run in O(n) time, as a dict is a hash table.
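For comparison, a small sketch of what overwriting in place produces next to the key-renaming loop from the question. Which one is appropriate depends on whether the different text or numerical options need to stay distinguishable after encoding. The function names overwrite_in_place and rename_keys and the sample dicts are hypothetical, used only for illustration:

```python
def overwrite_in_place(d):
    """Set every non-boolean value to True, keeping the original keys."""
    for k, v in d.items():
        if type(v) is not bool:
            d[k] = True  # same key, so the dict's size never changes
    return d


def rename_keys(d):
    """Replace each non-boolean entry with '<key>_<value>': True."""
    # Iterate over a snapshot, because keys are added and deleted.
    for k, v in list(d.items()):
        if type(v) is not bool:
            d['{}_{}'.format(k, v)] = True
            del d[k]
    return d


# Two records that differ only in the text option of attribute_3.
a = overwrite_in_place({'attribute_3': 'foo'})
b = overwrite_in_place({'attribute_3': 'bar'})
c = rename_keys({'attribute_3': 'foo'})
e = rename_keys({'attribute_3': 'bar'})

same_after_overwrite = (a == b)  # the option is no longer visible
same_after_rename = (c == e)     # the option survives in the key
```

Both loops are linear in the number of entries; the difference is only in what information the result keeps.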
Upvotes: 0