tzippy

Reputation: 6638

Diff data in two JSON documents and persist the differences in Python

I am planning to pull data from a web service at fixed intervals (twice a day) using a Python script. After getting the data as JSON, I would like to compare it to the data I pulled the last time, make the differences available as JSON, store the newly pulled JSON for the next comparison, and delete the previous data.

The pulled JSON will always have the same schema, similar to this example:

 [
  {"Title" : "Class1", "ID" : "155e5acc-aaa7e-2872ade", "Time" : "2020-10-15 08:15:00", "Participants": 0, "Cancelled": 0, "Booked": 0, "Full":0 },
  {"Title" : "Class3",  "ID" : "235e5a3b-b890e-3200ad9", "Time" : "2020-10-15 11:00:00","Participants": 11, "Cancelled": 0, "Booked": 1, "Full":0 },
  {"Title" : "Class0",  "ID" : "985e5a4a-3cd7e-ac87a76","Time" : "2020-10-15 15:30:00","Participants": 5, "Cancelled": 1, "Booked": 0, "Full":0 }
 ]

Each item in the array can be identified by its ID and compared to the corresponding item from the previous pull. If any of the other fields (Participants, Booked, Full, Time) differs, I want to store the diff in a way that makes clear which field of which item has changed. The diff could have its own schema in which each element is again identified by its ID. For example, if Time and Booked for "Class1" have changed, the diff would contain an array of entries with the previous value in Was and the new value in Is:

    [
      {"ID": "155e5acc-aaa7e-2872ade", "ChangedData" : [
          {"Field": "Time", "Was" : "2020-10-15 08:15:00", "Is": "2020-10-15 09:15:00" },
          {"Field" : "Booked", "Was": 0, "Is" : 1} ]}
    ]
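To make the target format concrete, here is a minimal sketch of how such a diff could be built with the standard library only. The function name `diff_by_id` and its `key` parameter are hypothetical, and the sketch assumes every item carries a unique ID; items that appear in only one of the two pulls are skipped rather than reported.

```python
def diff_by_id(old_items, new_items, key="ID"):
    """Compare two lists of dicts item by item, matched on `key`.

    Returns a list shaped like the diff example above:
    [{"ID": ..., "ChangedData": [{"Field": ..., "Was": ..., "Is": ...}]}]
    """
    # Index the previous pull by ID so each lookup is O(1).
    old_by_id = {item[key]: item for item in old_items}
    changes = []
    for new in new_items:
        old = old_by_id.get(new[key])
        if old is None:
            continue  # item is new; additions are not handled in this sketch
        changed = [
            {"Field": field, "Was": old.get(field), "Is": value}
            for field, value in new.items()
            if field != key and old.get(field) != value
        ]
        if changed:
            changes.append({key: new[key], "ChangedData": changed})
    return changes
```

Note that this compares every field except the ID, so a changed Title or Cancelled value would show up as well; restricting the comparison to a fixed set of fields would be a small change to the filter.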

My actual question is: is there a best practice for this, or something along those lines? I know what I want, but I am not sure exactly how to get there. At the moment I would just pull the data and save it to disk, then on the next run load it again, compare it to the new data by iterating over the items in both arrays, and compare the ID and each data element. But maybe there is already an existing JSON package for a task like this. Thank you in advance!
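For the persistence part, the save/load/compare cycle described above could look roughly like the sketch below. All names here (`load_snapshot`, `save_json`, `run_once`, and the file paths passed in) are hypothetical, and `differ` stands for whatever comparison function is used; the only assumption is that it takes the old and new lists and returns something JSON-serializable.

```python
import json
import os


def load_snapshot(path):
    """Return the previously stored data, or an empty list on the first run."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return []


def save_json(path, data):
    """Write to a temporary file first, then atomically replace the target."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(data, fh, indent=2)
    os.replace(tmp_path, path)  # overwrites the old file in one step


def run_once(new_data, snapshot_path, diff_path, differ):
    """Diff new_data against the stored snapshot and persist both results."""
    previous = load_snapshot(snapshot_path)
    changes = differ(previous, new_data)
    save_json(diff_path, changes)
    save_json(snapshot_path, new_data)  # becomes "previous" for the next run
    return changes
```

Overwriting the snapshot with `os.replace` stores the new data and discards the old data in a single step, so a crash midway through a write should not leave a half-written snapshot behind.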

Upvotes: 2

Views: 44

Answers (1)

Itachi

Reputation: 6070

I think the dictdiffer package may help you check the differences. Here is a simple example from its official documentation:

>>> from dictdiffer import diff
>>> list(diff({'fruits': []}, {'fruits': ['apple', 'mango']}, expand=True))
[('add', 'fruits', [(0, 'apple')]), ('add', 'fruits', [(1, 'mango')])]

Upvotes: 1
