Reputation: 188
I have a directory full of JSON files from which I need to extract information and build a Pandas dataframe. My current solution works, but I have a feeling there is a more elegant way of doing this:
for entry in os.scandir(directory):
    if entry.path.endswith(".json"):
        with open(entry.path) as f:
            data = json.load(f)
        ...
        newline = field1 + ',' + field2 + ',' + ... + ',' + fieldn
        output.append(newline)
...
df = pd.read_csv(io.StringIO('\n'.join(output)))
Upvotes: 1
Views: 2407
Reputation: 459
Yes, this can be done better.
import os
import pandas as pd
from glob import glob

# path is the directory containing the JSON files
all_files = glob(os.path.join(path, "*.json"))
ind_df = (pd.read_json(f) for f in all_files)
df = pd.concat(ind_df, ignore_index=True)
Using a generator expression keeps the code concise and avoids keeping a separate list of intermediate DataFrames around before the concatenation.
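If you only need a few specific fields from each file (as your `field1`, ..., `fieldn` suggests), you can also skip the CSV round-trip entirely and build the DataFrame from a list of dicts. A minimal sketch, assuming the fields you want are top-level keys (the function name and `fields` parameter are just illustrative):

```python
import json
import os
from glob import glob

import pandas as pd


def load_selected_fields(directory, fields):
    """Collect only the listed top-level keys from each JSON file
    in `directory` and return them as one DataFrame."""
    records = []
    for path in glob(os.path.join(directory, "*.json")):
        with open(path) as f:
            data = json.load(f)
        # Missing keys become NaN instead of raising KeyError
        records.append({k: data.get(k) for k in fields})
    return pd.DataFrame(records, columns=fields)
```

This avoids the quoting/escaping pitfalls of assembling CSV lines by hand (e.g. a field value containing a comma would break `pd.read_csv`).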
Upvotes: 4