Vasanth Raghavan

Reputation: 162

modifying json - deleting certain elements within a json structure using python

My JSON structure is as follows:

"AGENT": {
    "pending": [],
    "active": null,
    "completed": [
        {
            "result": {
                "job1.AGENT": "SUCCESS",
                "job2.AGENT": "SUCCESS"
            },
            "return_value": {
                "job1.AGENT": "",
                "job2.AGENT": ""
            },
            "visible": true,
            "global": true,
            "locale": [
                "en_US"
            ],
            "complete_time": "2018-01-24T17:44:33.484Z",
            "persist": true,
            "type": "script",
            "script": "<script_name>.py",
            "preset_status": "CONFIGURING",
            "parameters": {},
            "submit_time": "2018-01-24T17:44:26.747Z"
        },
        {
            "result": {
                ..
            },
            "return_value": {
                ..
            },
            "visible": true,
            "global": true,
            "locale": [
                "en_US"
            ],
            "complete_time": "2018-04-2T17:44:40.049Z",
            "submit_time": "2018-04-2T17:44:26.817Z"
        }

I need to delete entire blocks from the "completed" list based on complete_time, for example every block with a complete_time before 2018-04-03.

How can I achieve this in Python?

I have tried the following so far :

     json_data = json.dumps(data) 
     item_dict = json.loads(data)
     print item_dict["AGENT"]["completed"][0]["complete_time"]

This prints the complete time. However, my problem is that "AGENT" is not a constant string; it can vary. I also need to work out the logic to remove the entire JSON block based on complete_time.

Upvotes: 0

Views: 60

Answers (1)

Serge Ballesta

Reputation: 148890

OK, I assume that you were able to load the JSON correctly into a Python dictionary, let's call it item_dict, and that only the top-level key may vary.

What you need now is to walk through that Python object and decode the complete_time field. Unfortunately, Python's strptime does not know about the Z time zone suffix (at least before Python 3.7, where the %z directive began accepting it), so we will have to skip that last character.

Additionally, you should never modify a collection while iterating over it, so the bulletproof way is to store the indices to remove and delete them afterwards. The code could be:
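As a small illustration of that parsing step, here is a sketch that assumes every timestamp ends in a literal Z, as in the question. The helper name parse_complete_time is hypothetical and not part of any library:

```python
import datetime

DATE_FORMAT = '%Y-%m-%dT%H:%M:%S.%f'

def parse_complete_time(text):
    """Parse a timestamp such as '2018-01-24T17:44:33.484Z' into a naive datetime."""
    if text.endswith('Z'):      # drop the trailing zone marker before parsing
        text = text[:-1]
    return datetime.datetime.strptime(text, DATE_FORMAT)

print(parse_complete_time('2018-01-24T17:44:33.484Z'))   # 2018-01-24 17:44:33.484000
```

The result is a naive datetime (no tzinfo), which is why it can be compared directly with a naive limit date.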

import datetime

datelimit = datetime.datetime(2018, 4, 3)       # blocks completed before this date are removed
to_remove = []
dateformat = '%Y-%m-%dT%H:%M:%S.%f'
for k, v in item_dict.items():                  # enumerate top-level objects
    for i, block in enumerate(v['completed']):  # enumerate inner blocks
        complete_time = datetime.datetime.strptime(   # skip last char ('Z') of complete_time
            block["complete_time"][:-1], dateformat)
        # print(k, i, complete_time)              # uncomment for tests
        if complete_time < datelimit:           # too old
            to_remove.append((k, i))            # store the index for later processing

for k, i in reversed(to_remove):           # start from the end to keep consistent indices
    del item_dict[k]["completed"][i]       # actual deletion
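
An alternative that avoids tracking indices is to rebuild each "completed" list and keep only the recent blocks. The following is only a sketch: the function name drop_old_blocks is hypothetical, the sample data is a trimmed, made-up stand-in for the structure in the question (the third entry is invented for the demonstration), and it assumes every top-level value has a "completed" list whose blocks carry a complete_time ending in Z:

```python
import datetime
import json

DATE_FORMAT = '%Y-%m-%dT%H:%M:%S.%f'

def drop_old_blocks(item_dict, datelimit):
    """Keep only the 'completed' blocks whose complete_time is not before datelimit."""
    for value in item_dict.values():            # the top-level key ("AGENT") may vary
        value['completed'] = [
            block for block in value['completed']
            if datetime.datetime.strptime(
                block['complete_time'][:-1], DATE_FORMAT) >= datelimit
        ]
    return item_dict

# Made-up sample data modelled on the question's structure
raw = '''{"AGENT": {"pending": [], "active": null, "completed": [
    {"complete_time": "2018-01-24T17:44:33.484Z", "type": "script"},
    {"complete_time": "2018-04-2T17:44:40.049Z"},
    {"complete_time": "2018-04-05T09:00:00.000Z"}
]}}'''

item_dict = drop_old_blocks(json.loads(raw), datetime.datetime(2018, 4, 3))
print(json.dumps(item_dict, indent=4))          # only the 2018-04-05 block remains
```

Because each list is replaced rather than edited in place, nothing is modified while it is being iterated, so no reversed index bookkeeping is needed.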

Upvotes: 1
