Raf A
Raf A

Reputation: 179

How can I use Python to generate nested JSON data from my CSV file

I have tried to use the online Jsonify It tool which can create nested JSON data from my data but I can't seem to get that to work. I have also tried to use the Python code from other posts on but they do not seem to work either. If you know an easier method than using Python, that would be good.

Here is my .CSV data:

ID,Name,Date,Subject,Start,Finish
0,Ladybridge High School,01/11/2019,Maths,05:28,0
0,Ladybridge High School,02/11/2019,Maths,05:30,06:45
0,Ladybridge High School,01/11/2019,Economics,11:58,12:40
0,Ladybridge High School,02/11/2019,Economics,11:58,12:40
1,Loreto Sixth Form,01/11/2019,Maths,05:28,06:45
1,Loreto Sixth Form,02/11/2019,Maths,05:30,06:45
1,Loreto Sixth Form,01/11/2019,Economics,11:58,12:40
1,Loreto Sixth Form,02/11/2019,Economics,11:58,12:40

This is the nested JSON structure I would like:

{
  "Timetable" : [ {
    "Date" : {
      "01-11-2019" : {
    "Maths" : {
      "Start" : "05:28",
      "Finish" : "06:45"
    },
    "Economics" : {
      "Start" : "11:58",
      "Finish" : "12:40"
    }
      },
      "02-11-2019" : {
    "Maths" : {
      "Start" : "05:30",
      "Finish" : "06:45"
    },
    "Economics" : {
      "Start" : "11:58",
      "Finish" : "12:40"
    }
      }
    },
    "Name" : "Ladybridge High School"
  }, {
    "Date" : {
      "01-11-2019" : {
    "Maths" : {
      "Start" : "05:28",
      "Finish" : "06:45"
    },
    "Economics" : {
      "Start" : "11:58",
      "Finish" : "12:40"
    }
      },
      "02-11-2019" : {
    "Maths" : {
      "Start" : "05:30",
      "Finish" : "06:45"
    },
    "Economics" : {
      "Start" : "11:58",
      "Finish" : "12:40"
    }
      }
    },
    "Name" : "Loreto Sixth From"
  } ]
}

Upvotes: 1

Views: 269

Answers (2)

adamkgray
adamkgray

Reputation: 1937

Something like this?

[EDIT]

I refactored it to handle arbitrary top-level keys for each entry in the timetable. I also made it first create a dict and then convert the dict to a list so that it can run in O(N) time, in case the input is very large.

import csv
timetable = {}
with open('data.csv') as f:
    csv_data = [{k: v for k, v in row.items()} for row in csv.DictReader(f, skipinitialspace=True)]
for row in csv_data:
    if not timetable.get(row["ID"]):
        timetable[row["ID"]] = {"ID": row["ID"], "Date": {}}
    for k in row.keys():
        # Date has to be handled as a special case
        if k == "Date":
            timetable[row["ID"]]["Date"][row["Date"]] = {}
            timetable[row["ID"]]["Date"][row["Date"]][row["Subject"]] = {
                "Start": row["Start"],
                "Finish": row["Finish"]
            }
        # Ignore these keys because they are only for 'Date'
        elif k == "Start" or k == "Finish" or k == "Subject":
            continue
        # Use everything else
        else:
            timetable[row["ID"]][k] = row[k]
timetable = {"Timetable": [v for k, v in timetable.items()]}

Upvotes: 2

Raf A
Raf A

Reputation: 179

An improvement to the above answer to nest the ID before the name and date:

import csv timetable = {"Timetable": []} print(timetable) with open("C:/Users/kspv914/Downloads/data.csv") as f: csv_data = [{k: v for k, v in row.items()} for row in csv.DictReader(f, skipinitialspace=True)] name_array = [] for name in [row["Name"] for row in csv_data]: name_array.append(name) name_set = set(name_array) for name in name_set: timetable["Timetable"].append({"Name": name, "Date": {}}) for row in csv_data: for entry in timetable["Timetable"]: if entry["Name"] == row["Name"]: entry["Date"][row["Date"]] = {} entry["Date"][row["Date"]][row["Subject"]] = { "Start": row["Start"], "Finish": row["Finish"] } print(timetable)

Upvotes: 0

Related Questions