user3754136

Reputation: 529

How to convert JSON to CSV in Python with proper headers

I am trying to convert JSON data to CSV. I get the values, but each block shows up as a single line in the result. I am new to Python, so any help is appreciated. This is the code I have tried:

import pandas as pd

with open(r'C:\Users\anath\hard.json', encoding='utf-8') as inputfile:
    df = pd.read_json(inputfile)

    df.to_csv(r'C:\Users\anath\csvfile.csv', encoding='utf-8', index=True)

Sample JSON from the source file (short snippet):

{
  "issues": [
    {
      "issueId": 110052,
      "revision": 84,
      "definitionId": "DNS1012",
      "subject": "urn:h:domain:fitestdea.com",
      "subjectDomain": "fitestdea.com",
      "title": "Nameserver name doesn\u0027t resolve to an IPv6 address",
      "category": "DNS",
      "severity": "low",
      "cause": "urn:h:domain:ns1.gname.net",
      "causeDomain": "ns1.gname.net",
      "open": true,
      "status": "active",
      "auto": true,
      "autoOpen": true,
      "createdOn": "2022-09-01T02:29:09.681451Z",
      "lastUpdated": "2022-11-23T02:26:28.785601Z",
      "lastChecked": "2022-11-23T02:26:28.785601Z",
      "lastConfirmed": "2022-11-23T02:26:28.785601Z",
      "details": "{}"
    },
    {
      "issueId": 77881,
      "revision": 106,
      "definitionId": "DNS2001",
      "subject": "urn:h:domain:origin-mx.stagetest.test.com.test.com",
      "subjectDomain": "origin-mx.stagetest.test.com.test.com",
      "title": "Dangling domain alias (CNAME)",
      "category": "DNS",
      "severity": "high",
      "cause": "urn:h:domain:origin-www.stagetest.test.com.test.com",
      "causeDomain": "origin-www.stagetest.test.com.test.com",
      "open": true,
      "status": "active",
      "auto": true,
      "autoOpen": true,
      "createdOn": "2022-08-10T09:34:36.929071Z",
      "lastUpdated": "2022-11-23T09:33:32.553663Z",
      "lastChecked": "2022-11-23T09:33:32.553663Z",
      "lastConfirmed": "2022-11-23T09:33:32.553663Z",
      "details": "{\"@type\": \"hardenize/com.hardenize.schemas.dns.DanglingProblem\", \"rrType\": \"CNAME\", \"rrDomain\": \"origin-mx.stagetest.test.com.test.com\", \"causeDomain\": \"origin-www.stagetest.test.com.test.com\", \"danglingType\": \"nxdomain\", \"rrEffectiveDomain\": \"origin-mx.stagetest.test.com.test.com\"}"
    }
  ]
}

In the output I get so far, the entire record ends up in one cell. I am looking for a way to have the field names in the header and the values in separate columns. Is there also a way to export only specific fields, such as title, severity, or issueId, rather than everything?

Upvotes: 1

Views: 703

Answers (2)

Andrej Kesely

Reputation: 195438

Try:

import json
import pandas as pd

with open("your_file.json", "r") as f_in:
    data = json.load(f_in)

df = pd.DataFrame(data["issues"])
print(df[["title", "severity", "issueId"]])

Prints:

                                                title severity  issueId
0  Nameserver name doesn't resolve to an IPv6 address      low   110052
1                       Dangling domain alias (CNAME)     high    77881

To save as CSV you can do:

df[["title", "severity", "issueId"]].to_csv('data.csv', index=False)
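For context, here is a minimal sketch of why the original `pd.read_json` attempt likely puts each record into one cell: the top-level object has a single key, `issues`, so pandas builds one column of that name whose cells are whole dicts. The inline `raw` string is a trimmed, hypothetical stand-in for the file in the question, not the real data.

```python
import io
import json

import pandas as pd

# Trimmed stand-in for the file in the question (assumption: same top-level shape).
raw = json.dumps({
    "issues": [
        {"issueId": 110052, "title": "Nameserver name doesn't resolve to an IPv6 address", "severity": "low"},
        {"issueId": 77881, "title": "Dangling domain alias (CNAME)", "severity": "high"},
    ]
})

# Reading the whole document gives a single "issues" column holding dicts.
nested = pd.read_json(io.StringIO(raw))
print(list(nested.columns))

# Building the frame from the list under "issues" gives one column per field.
flat = pd.DataFrame(json.loads(raw)["issues"])
print(list(flat.columns))
```

The first print should show only `issues`, the second the individual field names, which is what makes the column selection above possible.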

Upvotes: 1

user3754136

Reputation: 529

This finally worked for me. Thanks for the input, @Andrej Kesely. Sharing it in case it helps others.

import pandas as pd
import json

with open(r'C:\Users\anath\hard.json', encoding='utf-8') as inputfile:
    data = json.load(inputfile)


df = pd.DataFrame(data["issues"])
print(df[["title", "severity", "issueId"]])
df[["title", "severity", "issueId"]].to_csv('data.csv', index=False)
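If pandas is not needed for anything else, the same field selection can probably be done with the standard library alone. This is only a sketch: `issues_to_csv` is a hypothetical helper name, the sample dict is a trimmed stand-in for the real file, and it writes to an in-memory buffer instead of `data.csv`.

```python
import csv
import io


def issues_to_csv(data, fields):
    """Return CSV text containing only `fields` for each record in data["issues"]."""
    buf = io.StringIO()
    # extrasaction="ignore" drops every key that is not listed in `fields`.
    writer = csv.DictWriter(buf, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(data["issues"])
    return buf.getvalue()


# Trimmed stand-in for the parsed JSON (assumption: same shape as the question's file).
sample = {
    "issues": [
        {"issueId": 77881, "title": "Dangling domain alias (CNAME)",
         "severity": "high", "category": "DNS"},
    ]
}

csv_text = issues_to_csv(sample, ["title", "severity", "issueId"])
print(csv_text)
```

To write a real file instead, the buffer could be replaced with `open('data.csv', 'w', newline='', encoding='utf-8')`.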

Upvotes: 0
