jdag
jdag

Reputation: 149

convert csv to json (nested objects)

I am new to python, and I am having to convert a csv file to json in following format:

CSV File :

firstname, lastname, email, customerid, dateadded, customerstatus
john, doe, [email protected], 124,26/11/18,active
jane, doe, [email protected], 125,26/11/18,active

JSON format:

{
    firstname: "John",
    lastname: "Doe",
    emailOrPhone: "[email protected]",
    extraFields: [{
            name: "customerid",
            value: "124"
        },
        {
            name: "dateadded",
            value: "26/11/18"
        },
        {
            name: "dateadded",
            value: "26/11/18"
        }
    ]
}, {
    firstname: "Jane",
    lastname: "Doe",
    emailOrPhone: "[email protected]",
    extraFields: [{
            name: "customerid",
            value: "125"
        },
        {
            name: "dateadded",
            value: "26/11/18"
        },
        {
            name: "dateadded",
            value: "26/11/18"
        }
    ]
}


current code I am using:
import requests
import json
import time
import csv
import json
import glob
import os
import logging


for filename in glob.glob('D:\\api\\Extract.csv'):
    csvfile = os.path.splitext(filename)[0]
    jsonfile = csvfile + '.json'

    with open(csvfile+'.csv') as f:
        reader = csv.DictReader(f)
        rows = list(reader)

    with open(jsonfile, 'w') as f:
        json.dump(rows, f)

url = 'api_url'

with open("D:\\api\\Extract.json", "r") as read_file:
    data = json.load(read_file)

    for item in data:


        headers = {"Authorization" : "key", "Content-Type" : "application/json"}

        r = requests.post(url, data= json.dumps(item), headers= headers)



        logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(message)s',
                    handlers=[logging.FileHandler("D:\\api\\log_file.log"),
                              logging.StreamHandler()])

I can produce parent values in json, but I am not sure how do I get sub-nodes and parse column name as values and iterate through entire file like that. Above code converts csv to simple json objects, I want to achieve nested objects. I am thinking maybe appending would be the solution, but not sure how to pass column as value and corresponding data as value.

Upvotes: 6

Views: 6021

Answers (3)

Snivio
Snivio

Reputation: 1864

import csv
import sys
import json

#EDIT THIS LIST WITH YOUR REQUIRED JSON KEY NAMES
fieldnames=["firstname","secondname","age"]

def convert(filename):
  csv_filename = filename[0]
  print "Opening CSV file: ",csv_filename 
  f=open(csv_filename, 'r')
  csv_reader = csv.DictReader(f,fieldnames)
  json_filename = csv_filename.split(".")[0]+".json"
  print "Saving JSON to file: ",json_filename
  jsonf = open(json_filename,'w') 
  data = json.dumps([r for r in csv_reader])
  jsonf.write(data) 
  f.close()
  jsonf.close()

if __name__=="__main__":
  convert(sys.argv[1:])

USAGE:

python csv2json.py myCSVfile.txt

where myCSVfile.txt it's your CSV file (name it as you prefer).

It will create a JSON array in a file named myCSVfile.json

That's all.

Upvotes: 0

RoadRunner
RoadRunner

Reputation: 26315

At little more complicated than needs to be, but you can try building your JSON array as you read in your values from the csv file, then output your result to a .json file with json.dump at the end:

from csv import reader
from json import dump

top_fields = ["firstname", "lastname", "email"]
extra_fields = ["customerid", "dateadded", "customerstatus"]

data = []
with open("customers.csv") as csv_in:
    csv_reader = reader(csv_in)

    # Get headers
    headers = list(map(str.strip, next(csv_reader)))

    for row in csv_reader:
        json_object = {}

        # Build dictionary for each row
        row_map = dict(zip(headers, map(str.strip, row)))

        # Add in top fields first
        for top in top_fields:
            json_object[top] = row_map[top]

        # Then add in extra fields
        for extra in extra_fields:
            json_object.setdefault("extraFields", []).append(
                {"name": extra, "value": row_map[extra]}
            )

        data.append(json_object)

with open("customers.json", "w") as fp:
    dump(data, fp, indent=4, sort_keys=True)

Which gives the following customers.json:

[
    {
        "email": "[email protected]",
        "extraFields": [
            {
                "name": "customerid",
                "value": "124"
            },
            {
                "name": "dateadded",
                "value": "26/11/18"
            },
            {
                "name": "customerstatus",
                "value": "active"
            }
        ],
        "firstname": "john",
        "lastname": "doe"
    },
    {
        "email": "[email protected]",
        "extraFields": [
            {
                "name": "customerid",
                "value": "125"
            },
            {
                "name": "dateadded",
                "value": "26/11/18"
            },
            {
                "name": "customerstatus",
                "value": "active"
            }
        ],
        "firstname": "jane",
        "lastname": "doe"
    }
]

Upvotes: 3

slider
slider

Reputation: 12990

You can use csv.DictReader which gives you access to the column name as you're iterating each row. Then you can build each item as follows:

import json
import csv

primary_fields = ['firstname', 'lastname', 'email']
result = []
with open('mydata.csv') as csv_file:
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    for row in reader:
        d = {k: v for k, v in row.items() if k in primary_fields}
        d['extraFields'] = [{'name': k, 'value': v} for k, v in row.items() if k not in primary_fields]
        result.append(d)

print(json.dumps(result, indent=2))

Output

[
  {
    "firstname": "john",
    "lastname": "doe",
    "email": "[email protected]",
    "extraFields": [
      {
        "name": "customerid",
        "value": "124"
      },
      {
        "name": "dateadded",
        "value": "26/11/18"
      },
      {
        "name": "customerstatus",
        "value": "active"
      }
    ]
  },
  {
    "firstname": "jane",
    "lastname": "doe",
    "email": "[email protected]",
    "extraFields": [
      {
        "name": "customerid",
        "value": "125"
      },
      {
        "name": "dateadded",
        "value": "26/11/18"
      },
      {
        "name": "customerstatus",
        "value": "active"
      }
    ]
  }
]

If you want to set custom field names in your final json (e.g. emailOrPhone for email), you can always manually set field names for d and set the appropriate value

Upvotes: 6

Related Questions