Lonewolf

Reputation: 197

Trouble with importing a file with pandas read_json

I'm new to Python (I'm using Python 3) and I'm trying to import a JSON file in a Jupyter notebook. However, it raises the error below:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 4276350: character maps to <undefined> 

Below is the code:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib as plt
import json
%matplotlib inline

with open('C:\\Users/Desktop/Machine  Learning/yelp_academic_dataset_business.json') as datafile:
    data = pd.read_json(datafile,orient='columns',encoding='utf-8')
dataframe = pd.DataFrame(data)

I would appreciate any help.

Upvotes: 2

Views: 2775

Answers (1)

cody

Reputation: 11157

Assuming this is the file you're trying to import, it is not a single JSON document but many JSON objects, one per line. You need to tell pandas to parse it line by line by specifying lines=True:

data = pd.read_json(datafile, lines=True, orient='columns', encoding='utf-8')
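To see what lines=True does in isolation, here is a minimal sketch that writes a small file with one JSON object per line and reads it back. The records and the file name sample.json are made up for illustration; they are not taken from the Yelp dataset.

```python
import os
import tempfile

import pandas as pd

# Two hypothetical records, one JSON object per line.
sample = (
    '{"business_id": "a1", "city": "Phoenix", "stars": 4.0}\n'
    '{"business_id": "b2", "city": "Glendale", "stars": 3.5}\n'
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # lines=True tells pandas to parse each line as its own JSON object,
    # producing one DataFrame row per line.
    df = pd.read_json(path, lines=True, encoding="utf-8")

print(df.shape)
print(df["city"].tolist())
```

Each line becomes one row, and the keys of the objects become the columns.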

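Also, pass the file path as the first argument rather than an already-open file object. Calling open() without an encoding argument decodes the file with the platform default (on Windows typically cp1252, which is the 'charmap' codec named in your error), so the encoding='utf-8' you give to read_json never takes effect. You can get rid of the code for opening the file. Furthermore, pd.read_json already returns a DataFrame, so there is no need for the last line of your program: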

>>> data = pd.read_json('yelp_academic_dataset_business.json', lines=True, orient='columns', encoding='utf-8')
>>> data
                                              attributes             business_id                                         categories             city    ...    review_count stars  state      type
0      {'Take-out': False, 'Wi-Fi': 'free', 'Good For...  O_X3PGhk3Y5JWVi866qlJg  [Active Life, Arts & Entertainment, Stadiums &...          Phoenix    ...              29   4.0     AZ  business
1      {'Parking': {'garage': False, 'street': False,...  QbrM7wqtmoNncqjc6GtFaQ  [Tires, Automotive, Fashion, Shopping, Departm...         Glendale    ...               3   3.5     AZ  business
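If you would rather keep the with open(...) block, the encoding has to be given to open() itself, because that is where the bytes are decoded into text. The following is only a sketch under that assumption, using a made-up one-line file with a non-ASCII character instead of the real dataset:

```python
import os
import tempfile

import pandas as pd

# Hypothetical record containing a non-ASCII character (e with acute accent).
sample = '{"name": "Caf\u00e9", "city": "Phoenix"}\n'

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # The encoding is set on open(); read_json then receives text that
    # has already been decoded as UTF-8.
    with open(path, encoding="utf-8") as datafile:
        df = pd.read_json(datafile, lines=True)

print(df.loc[0, "name"])
```

Passing the path directly to pd.read_json, as shown above, remains the shorter option.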

Upvotes: 2
