Jaffer Wilson

Reputation: 7273

How can I reliably access a single key-value pair from a JSON file that's too large to load into memory?

I am trying to retrieve the names of the people from my file. The file size is 201 GB.

import json

with open("D:/dns.json", "r") as fh:
    for l in fh:
        d = json.loads(l)
        print(d["name"])

Whenever I run this program on Windows, it fails with a MemoryError saying there is insufficient memory.

Is there a reliable way to parse a single key-value pair without loading the whole file? I have been thinking of reading the file in chunks, but I don't know where to start.

Here is a sample: test.json

Every record is on its own line, separated by a newline. Hope this helps.

Upvotes: 3

Views: 84

Answers (2)

bruno desthuilliers

Reputation: 77912

You may want to give ijson a try: https://pypi.python.org/pypi/ijson

Upvotes: 1

holdenweb

Reputation: 37043

Unfortunately there is no guarantee that each line of a JSON file will make any sense to the parser on its own. I'm afraid JSON was never intended for multi-gigabyte data exchange, precisely because each JSON file contains an integral data structure. In the XML world people have written incremental event-driven (SAX-based) parsers. I'm not aware of such a library for JSON.
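For what it's worth, if the file really is a sequence of separate JSON values (as the question suggests), the chunked reading the asker had in mind can be sketched with the standard library alone, using `json.JSONDecoder.raw_decode`. This is only an illustration under that assumption, and `iter_json_values` and `chunk_size` are names made up for the example. It does not help if the whole file is a single huge JSON value.

```python
import json


def iter_json_values(fh, chunk_size=1 << 20):
    """Yield top-level JSON values from a text-mode file, one chunk at a time.

    Assumes the file is a sequence of JSON values separated by whitespace
    (for example one object per line). Only the current buffer is kept in
    memory, so a single record must still fit in RAM.
    """
    decoder = json.JSONDecoder()
    buf = ""
    while True:
        chunk = fh.read(chunk_size)  # empty string means end of file
        buf += chunk
        pos = 0
        while True:
            # raw_decode does not skip leading whitespace, so do it here.
            while pos < len(buf) and buf[pos].isspace():
                pos += 1
            if pos >= len(buf):
                break
            try:
                value, end = decoder.raw_decode(buf, pos)
            except json.JSONDecodeError:
                if not chunk:
                    raise  # at end of file: the data really is malformed
                break  # probably an incomplete value: read more data
            if end == len(buf) and chunk:
                # The value touches the end of the buffer, so it might be
                # truncated (e.g. half a number). Wait for more data.
                break
            yield value
            pos = end
        buf = buf[pos:]  # keep only the unparsed tail
        if not chunk:
            break
```

Usage would be something like `for record in iter_json_values(fh): print(record["name"])`. Note that a malformed record in the middle of the file makes the buffer grow until the end of the file before the error is raised, so this is a sketch rather than something hardened.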

Upvotes: 0
