Rworldproblems

Reputation: 83

Why do I get 'unhashable type: dict' error when recursively cleaning json object?

I am trying to clean a json object by removing keys if their value is 'N/A', '-', or '' and likewise removing any of these values from any lists. Example of object to be cleaned:

dirty = {
    'name': {'first': 'Robert', 'middle': '', 'last': 'Smith'},
    'age': 25,
    'DOB': '-',
    'hobbies': ['running', 'coding', '-'],
    'education': {'highschool': 'N/A', 'college': 'Yale'}
}

I found a similar problem and modified the solution, giving this function:

def clean_data(value):
    """
    Recursively remove all values of 'N/A', '-', and '' 
    from dictionaries and lists, and return
    the result as a new dictionary or list.
    """
    missing_indicators = set(['N/A', '-', ''])
    if isinstance(value, list):
        return [clean_data(x) for x in value if x not in missing_indicators]
    elif isinstance(value, dict):
        return {
            key: clean_data(val)
            for key, val in value.items()
            if val not in missing_indicators
        }
    else:
        return value

But I get the unhashable type: dict error from the dictionary comprehension:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-79-d42b5f1acaff> in <module>
----> 1 clean_data(dirty)

<ipython-input-72-dde33dbf1804> in clean_data(value)
     11         return {
     12             key: clean_data(val)
---> 13             for key, val in value.items()
     14             if val not in missing_indicators
     15         }

<ipython-input-72-dde33dbf1804> in <dictcomp>(.0)
     12             key: clean_data(val)
     13             for key, val in value.items()
---> 14             if val not in missing_indicators
     15         }
     16     else:

TypeError: unhashable type: 'dict'

Obviously something about the way I do the set comparison doesn't work the way I think it should when val is a dict. Can anyone enlighten me?

Upvotes: 0

Views: 947

Answers (1)

erewok

Reputation: 7835

At first glance, this looks like a problem:

if val not in missing_indicators

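When you use the in operator on a set, Python hashes the value you are asking about in order to look it up among the set's entries. So, just as a value must be hashable to be a key in a dict or a member of a set, it must also be hashable to be tested for membership in a set. You can check whether a value is hashable by calling hash on it (the exact number printed for a string varies between interpreter sessions):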

>>> hash(1)
1
>>> hash("hello")
7917781502247088526
>>> hash({"1":"2"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'

In your snippet, val is sometimes a dict (the value under 'name', for example), and you are asking Python whether that dict is one of the values in the set. To answer, Python first tries to hash val, and that is the step that fails with the TypeError in your traceback.
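To see this in isolation, here is a minimal sketch that runs the same membership test outside of the comprehension. The set mirrors missing_indicators from the question; the error_message variable is only there for illustration.

```python
missing_indicators = {'N/A', '-', ''}

# A string is hashable, so the membership test works as expected.
print('-' in missing_indicators)

# A dict is not hashable, so the same test raises TypeError
# before any comparison against the set's entries takes place.
try:
    {'first': 'Robert'} in missing_indicators
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The exact wording of the message can differ slightly between Python versions, but it should mention an unhashable type.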

The hurdle you have to overcome is that some of the values in your outer dict are themselves a dict, whereas others are a list, str or int. A list is unhashable too, so a list value (such as 'hobbies') would trigger the same kind of error in that membership test. You will need a different strategy in each case: check what type of thing val is and then act accordingly.
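One way to act on that advice, as a sketch rather than the only possible fix: only run the membership test when the value is a str, and recurse into everything else. The helper name is_missing is hypothetical and not part of the original code; the sketch also assumes the missing indicators are always strings.

```python
MISSING_INDICATORS = ('N/A', '-', '')


def is_missing(value):
    """Hypothetical helper: True only for strings that mark missing data."""
    # The isinstance check runs first, so dicts and lists are never
    # passed to the membership test and never need to be hashed.
    return isinstance(value, str) and value in MISSING_INDICATORS


def clean_data(value):
    """Recursively drop missing indicators from dicts and lists."""
    if isinstance(value, list):
        return [clean_data(x) for x in value if not is_missing(x)]
    if isinstance(value, dict):
        return {
            key: clean_data(val)
            for key, val in value.items()
            if not is_missing(val)
        }
    return value


dirty = {
    'name': {'first': 'Robert', 'middle': '', 'last': 'Smith'},
    'age': 25,
    'DOB': '-',
    'hobbies': ['running', 'coding', '-'],
    'education': {'highschool': 'N/A', 'college': 'Yale'},
}

cleaned = clean_data(dirty)
print(cleaned)
```

Note that this sketch keeps a dict or list that becomes empty after cleaning; whether those should be dropped as well depends on what you need.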

Upvotes: 1
