Reputation: 197
I'm new to Python (I'm using Python 3) and I'm trying to import a JSON file in a Jupyter notebook. However, it raises the error below:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 4276350: character maps to <undefined>
Below is the code:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib as plt
import json
%matplotlib inline
with open('C:\\Users/Desktop/Machine Learning/yelp_academic_dataset_business.json') as datafile:
    data = pd.read_json(datafile,orient='columns',encoding='utf-8')
dataframe = pd.DataFrame(data)
I would appreciate any help.
Upvotes: 2
Views: 2775
Reputation: 11157
Assuming this is the file you're trying to import, it is actually many JSON objects, one per line. You need to import it line by line by specifying lines=True:
data = pd.read_json(datafile, lines=True, orient='columns', encoding='utf-8')
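To illustrate what "one JSON object per line" means, here is a minimal sketch using only the standard library. The two sample records and the helper name read_json_lines are made up for illustration and are not taken from the Yelp file. It shows that parsing the whole text as a single JSON document fails, while parsing each line separately works, which is roughly the behaviour lines=True asks pandas for.

```python
import io
import json

# Two hypothetical records in JSON Lines form: one complete JSON object per line.
text = (
    '{"business_id": "a1", "city": "Phoenix", "stars": 4.0}\n'
    '{"business_id": "b2", "city": "Glendale", "stars": 3.5}\n'
)

# Parsing the whole text as a single JSON document fails, because a second
# object follows the first one.
try:
    json.loads(text)
    whole_file_ok = True
except json.JSONDecodeError:
    whole_file_ok = False


def read_json_lines(handle):
    """Parse every non-empty line of a text handle as its own JSON document."""
    return [json.loads(line) for line in handle if line.strip()]


# io.StringIO stands in for an open text file here.
records = read_json_lines(io.StringIO(text))
print(whole_file_ok, len(records), records[1]["city"])
```

A list of dictionaries like records could then be passed to pd.DataFrame, although pd.read_json with lines=True does the same job in one call.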
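Also, pass the file path as the first argument instead of a file object you opened yourself, so you can get rid of the code for opening the file. The handle from open() was created without an encoding, so on Windows it is decoded with the locale's default code page (the 'charmap' codec named in the traceback), and the encoding='utf-8' argument to pd.read_json does not change how an already-open text handle is decoded. Furthermore, pd.read_json already returns a DataFrame, so there is no need for the last line of your program: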
>>> data = pd.read_json('yelp_academic_dataset_business.json', lines=True, orient='columns', encoding='utf-8')
>>> data
                                          attributes             business_id                                         categories      city  ...  review_count  stars  state      type
0  {'Take-out': False, 'Wi-Fi': 'free', 'Good For...  O_X3PGhk3Y5JWVi866qlJg  [Active Life, Arts & Entertainment, Stadiums &...   Phoenix  ...            29    4.0     AZ  business
1  {'Parking': {'garage': False, 'street': False,...  QbrM7wqtmoNncqjc6GtFaQ  [Tires, Automotive, Fashion, Shopping, Departm...  Glendale  ...             3    3.5     AZ  business
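The 'charmap' error in the question can be reproduced without the dataset. The sketch below assumes the default code page on the asker's Windows machine is cp1252, which the question does not confirm, and uses a made-up sample word. It encodes text as UTF-8 and then decodes the same bytes with both codecs: cp1252 has no character assigned to byte 0x8d, so decoding fails, while UTF-8 succeeds.

```python
# Hypothetical sample text: the letter U+010D is encoded in UTF-8 as the
# two bytes 0xC4 0x8D, and 0x8D is undefined in cp1252.
raw = "ve\u010der".encode("utf-8")

# Decoding with cp1252 (assumed here to be the Windows default code page)
# raises UnicodeDecodeError, as in the question.
try:
    raw.decode("cp1252")
    cp1252_failed = False
    bad_byte = None
except UnicodeDecodeError as exc:
    cp1252_failed = True
    bad_byte = raw[exc.start]  # the byte the codec could not map

# Decoding with the codec the data was actually written in works.
decoded = raw.decode("utf-8")
print(cp1252_failed, hex(bad_byte), decoded)
```

If you do want to open the file yourself, passing encoding='utf-8' to open() avoids relying on the platform default. Passing the path to pd.read_json together with encoding='utf-8', as above, has the same effect.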
Upvotes: 2