Reputation: 1233
I have a JSON file that is less than 1 GB in size. I am trying to read the file on a server that has 400 GB of RAM using the following simple command:
df = pd.read_json('filepath.json')
However, this code takes forever (several hours) to execute. I tried several suggestions, such as
df = pd.read_json('filepath.json', low_memory=False)
or
df = pd.read_json('filepath.json', lines=True)
But none of them worked. How can reading a 1 GB file on a server with 400 GB of RAM be so slow?
Upvotes: 0
Views: 2096
Reputation: 291
Chunking can shrink memory use. I also recommend the Dask library, which can load the data in parallel.
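A minimal sketch of both approaches, assuming the file is newline-delimited JSON (one record per line); the chunk and block sizes are arbitrary example values:

```python
import pandas as pd
import dask.dataframe as dd

# Chunked pandas read: chunksize requires lines=True (JSON Lines format).
chunks = pd.read_json('filepath.json', lines=True, chunksize=100_000)
df = pd.concat(chunks, ignore_index=True)

# Dask alternative: reads the file in parallel partitions (blocksize in bytes).
ddf = dd.read_json('filepath.json', lines=True, blocksize=2**27)
df = ddf.compute()  # materialize as a pandas DataFrame if it fits in memory
```

If the file is a single large JSON document rather than JSON Lines, neither `lines=True` nor `chunksize` will apply, and converting it to JSON Lines first may help.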
Upvotes: 1