rodrigocf

Reputation: 2099

Replace information in a JSON string based on a condition

I have a very large JSON file with several nested keys. From what I've read so far, if you do:

x = json.loads(data)

Python will interpret it as a dictionary (correct me if I'm wrong). The fourth level of nesting in the json file contains several elements named by an ID number and all of them contain an element called children, something like this:

{"level1":
    {"level2":
        {"level3":
            {"ID1":
                {"children": [1,2,3,4,5]}
            }
            {"ID2":
                {"children": []}
            }
            {"ID3":
                {"children": [6,7,8,9,10]}
            }
      }
   }
}

What I need to do is empty the "children" list (that is, set "children": []) for every ID that appears in a list called new_ids, and then convert the result back to JSON. I've been reading on the subject for a few hours now, but I haven't found anything similar enough to help.

I'm running Python 3.3.3. Any ideas are greatly appreciated!!

Thanks!!

EDIT

List:

new_ids=["ID1","ID3"]

Expected result:

{"level1":
    {"level2":
        {"level3":
            {"ID1":
                {"children": []}
            }
            {"ID2":
                {"children": []}
            }
            {"ID3":
                {"children": []}
            }
      }
   }
}

Upvotes: 1

Views: 3031

Answers (2)

furas

Reputation: 142651

If you have a simple dictionary like this:

data_dict = {
    "level1": {
        "level2":{
            "level3":{
                "ID1":{"children": [1,2,3,4,5]},
                "ID2":{"children": [] },
                "ID3":{"children": [6,7,8,9,10]},
            }
        }
    }
}

then you only need this:

data_dict = {
    "level1": {
        "level2":{
            "level3":{
                "ID1":{"children": [1,2,3,4,5]},
                "ID2":{"children": [] },
                "ID3":{"children": [6,7,8,9,10]},
            }
        }
    }
}

new_ids=["ID1","ID3"]

for idx in new_ids:
    if idx in data_dict['level1']["level2"]["level3"]:
        data_dict['level1']["level2"]["level3"][idx]['children'] = []

print(data_dict)

'''    
{
    'level1': {
        'level2': {
            'level3': {
                'ID2': {'children': []}, 
                'ID3': {'children': []}, 
                'ID1': {'children': []}
             }
        }
    }
}
'''

But if you have a more complicated dictionary, where the IDs sit under different parent keys, loop over the values at each level:

data_dict = {
    "level1a": {
        "level2a":{
            "level3a":{
                "ID2":{"children": [] },
                "ID3":{"children": [6,7,8,9,10]},
            }
        }
    },
    "level1b": {
        "level2b":{
            "level3b":{
                "ID1":{"children": [1,2,3,4,5]},
            }
        }
    }
}

new_ids =["ID1","ID3"]

# Walk the three outer levels without needing to know their key names.
for level1 in data_dict.values():
    for level2 in level1.values():
        for level3 in level2.values():
            for idx in new_ids:
                if idx in level3:
                    level3[idx]['children'] = []

print(data_dict)

'''
{
    'level1a': {
        'level2a': {
            'level3a': {
                'ID2': {'children': []}, 
                'ID3': {'children': []}
            }
        }
    },
    'level1b': {
        'level2b': {
            'level3b': {
                'ID1': {'children': []}
            }
        }
    }
} 
'''
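
If the depth of the IDs is not known in advance, the nested loops above can be swapped for a recursive walk. This is only a rough sketch, not something from the question: the function name clear_children_anywhere is made up for illustration, and I gave "ID2" some children here just to show that IDs outside new_ids are left alone.

```python
def clear_children_anywhere(node, ids):
    """Empty the "children" list under every key in ids, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in ids and isinstance(value, dict) and "children" in value:
                value["children"] = []  # replace the list; dict keys are untouched
            else:
                clear_children_anywhere(value, ids)  # keep descending
    elif isinstance(node, list):
        for item in node:
            clear_children_anywhere(item, ids)


# Illustrative data: same shape as above, but ID2 has children so the
# "not in new_ids" case is visible.
data_dict = {
    "level1a": {"level2a": {"level3a": {
        "ID2": {"children": [11, 12]},
        "ID3": {"children": [6, 7, 8, 9, 10]},
    }}},
    "level1b": {"level2b": {"level3b": {
        "ID1": {"children": [1, 2, 3, 4, 5]},
    }}},
}

clear_children_anywhere(data_dict, {"ID1", "ID3"})
print(data_dict)
```

Note that this assumes an ID key never appears at a level where it means something else; if it can, stick with the explicit loops.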

Upvotes: 0

timgeb

Reputation: 78690

First of all, your JSON is invalid. I assume you want this:

{"level1":
    {"level2":
        {"level3":
            {
            "ID1":{"children": [1,2,3,4,5]},
            "ID2":{"children": []},
            "ID3":{"children": [6,7,8,9,10]}
            }
        }
    }
}

Now, load your data as a dictionary:

>>> import json
>>> with open('file', 'r') as f:
...     x = json.load(f)
... 
>>> x
{u'level1': {u'level2': {u'level3': {u'ID2': {u'children': []}, u'ID3': {u'children': [6, 7, 8, 9, 10]}, u'ID1': {u'children': [1, 2, 3, 4, 5]}}}}}

Now you can loop over the keys in x['level1']['level2']['level3'] and check whether they are in your new_ids.

>>> new_ids=["ID1","ID3"]
>>> for key in x['level1']['level2']['level3']:
...     if key in new_ids:
...         x['level1']['level2']['level3'][key]['children'] = []
... 
>>> x
{u'level1': {u'level2': {u'level3': {u'ID2': {u'children': []}, u'ID3': {u'children': []}, u'ID1': {u'children': []}}}}}

You can now write x back to a file like this:

with open('myfile', 'w') as f:
    f.write(json.dumps(x))

If your new_ids list is large, consider making it a set.
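
Here is a minimal sketch of the whole round trip with new_ids turned into a set, assuming the corrected JSON layout from above. The helper name clear_children and the inline raw string are only there for illustration; in practice the data would come from your file.

```python
import json


def clear_children(doc, ids):
    """Empty "children" for every level3 entry whose key is in ids."""
    ids = set(ids)  # set lookups are O(1) on average, list lookups are O(n)
    level3 = doc["level1"]["level2"]["level3"]
    for key, entry in level3.items():
        if key in ids:
            entry["children"] = []  # only values change, so iterating is safe
    return doc


# Stand-in for the contents of the real file.
raw = (
    '{"level1": {"level2": {"level3": {'
    '"ID1": {"children": [1, 2, 3, 4, 5]}, '
    '"ID2": {"children": []}, '
    '"ID3": {"children": [6, 7, 8, 9, 10]}}}}}'
)

x = clear_children(json.loads(raw), ["ID1", "ID3"])
result = json.dumps(x)  # back to a JSON string
print(result)
```

For a handful of IDs the list is fine; the set only starts to matter when new_ids has thousands of entries and level3 is large as well.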

Upvotes: 1
