Reputation: 95
I have two computers (personal and work). On both, I call the same REST API to fetch the same ndjson data. Both computers have access to the API, and I use the same code on each.
On my personal computer, I am using the following code to flatten the nested ndjson file:
def flatten_json(nested_json, exclude=['']):
    """Flatten json object with nested keys into a single level.

    Args:
        nested_json: A nested json object.
        exclude: Keys to exclude from output.

    Returns:
        The flattened json object if successful, None otherwise.
    """
    out = {}

    def flatten(x, name='', exclude=exclude):
        if type(x) is dict:
            for a in x:
                if a not in exclude: flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x

    flatten(nested_json)
    return out
Finally, I call the function to flatten the ndjson data:
import pandas as pd
from io import StringIO
import ndjson
import json

items = response.json(cls=ndjson.Decoder)
df = pd.json_normalize(items)
d = {}
for cols in df:
    d[cols] = pd.DataFrame([flatten_json(x) for x in df[cols]])
print(d)
On my personal computer, the code works as expected: it flattens the ndjson data and returns a dictionary of DataFrames.
However, when I copy and paste this code into a Jupyter Notebook on my work computer, the ndjson data is only partially flattened. Could this be a version issue? Does anyone have advice on how to resolve it?
Upvotes: 1
Views: 788
Reputation: 666
The two environments may have different Python libraries installed, or different versions of the same libraries.
One good way to track your Python environment and dependencies from Jupyter is the watermark extension (https://github.com/rasbt/watermark), which lists the versions of Python and of the packages you name.
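If installing watermark on the work machine is not an option, a rough standard-library sketch can gather similar information. This is not how watermark does it. The helper name `report_versions` and the package list are hypothetical choices for illustration, and `requests` is only assumed to be the HTTP client used for the API call.

```python
import sys
from importlib import metadata


def report_versions(packages):
    """Return a dict mapping each name to its installed version, or None if missing.

    `report_versions` is a hypothetical helper, not part of watermark.
    """
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


# Assumed package list: adjust it to whatever the notebook actually imports.
for name, version in report_versions(["pandas", "ndjson", "requests"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Run the same cell on both computers and compare the output line by line. A mismatch in pandas or ndjson would be the first thing to look at.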
Upvotes: 2