rookie012

Reputation: 23

Error when trying to read a JSON file using pandas

I'm using Python (pandas) to read a JSON file of raw tweets, but I'm getting the following error:

ValueError: Unexpected character found when decoding array value (2)

I would appreciate any help.

EDIT: HERE IS A SAMPLE OF THE JSON

{"created_at":"Sat Nov 16 14:15:52 +0000 2019","id":1195707056365461505,"id_str":"1195707056365461505","text":"Any arsenal red members on here, dm me please...got a couple questions\ud83d\ude05\ud83e\udd14","source":"\u003ca href=\"http://twitter.com/download/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":974846850,"id_str":"974846850","name":"Rico Rodrigo","screen_name":"DatGuyTy_online","location":"Brum","url":null,"description":"Aspiring Accountant x Arsenal enthusiast x Anime addict","translator_type":"none","protected":false,"verified":false,"followers_count":647,"friends_count":901,"listed_count":9,"favourites_count":24989,"statuses_count":24628,"created_at":"Tue Nov 27 22:25:31 +0000 2012","utc_offset":null,"time_zone":null,"geo_enabled":true,"lang":null,"contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http://pbs.twimg.com/profile_images/1071377159682514945/Np4nGX5m_normal.jpg","profile_image_url_https":"https://pbs.twimg.com/profile_images/1071377159682514945/Np4nGX5m_normal.jpg","profile_banner_url":"https://pbs.twimg.com/profile_banners/974846850/1554183093","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"ent
ities":{"hashtags":[],"urls":[],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en","timestamp_ms":"1573913752057"}

This is the code I'm using to read the file:

import numpy as np 
import pandas as pd 
import re 
import matplotlib.pyplot as plt 
import json 
import os

tweet_file = 'raw_data.json' 
tweets = pd.read_json(tweet_file, convert_dates=True, lines=True, encoding='utf-8')

Upvotes: 1

Views: 1105

Answers (1)

questionto42

Reputation: 9512

I had this error with my own JSON file when reading it with pandas:

File ~/.local/lib/python3.9/site-packages/pandas/io/json/_json.py:1133, in FrameParser._parse_no_numpy(self)
   1129 orient = self.orient
   1131 if orient == "columns":
   1132     self.obj = DataFrame(
-> 1133         loads(json, precise_float=self.precise_float), dtype=None
   1134     )
   1135 elif orient == "split":
   1136     decoded = {
   1137         str(k): v
   1138         for k, v in loads(json, precise_float=self.precise_float).items()
   1139     }

ValueError: Unexpected character found when decoding array value (1)

I then opened the file in VS Code in JSON mode, went to line 2, column 914, and found a tab character right after that column where there should have been spaces.
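If you would rather find the offending position without an editor, Python's built-in json module reports the line and column of the first syntax error. This is only a sketch: find_json_error is a hypothetical helper name, the sample string is made up, and it assumes the tab sits inside a string value, where strict JSON parsers reject raw control characters.

```python
import json


def find_json_error(text):
    """Return (line, column, message) for the first JSON syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are the 1-based positions reported by the json module
        return err.lineno, err.colno, err.msg
    return None


# Made-up sample: the "\t" below is a real tab character inside a string value.
sample = '{"text": "a\tb"}'
print(find_json_error(sample))
print(find_json_error('{"text": "a b"}'))  # valid JSON, so this prints None
```

For a file with one JSON object per line (as read with lines=True in the question), apply the helper to each line rather than to the whole file.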

To fix this, I used a regex find-and-replace to replace all tabs with four spaces.
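The same replacement can be done in Python instead of the editor. This is a minimal sketch, not the exact steps I took: replace_tabs and clean_file are hypothetical helper names, the paths passed to clean_file are placeholders, and the sample string is made up.

```python
import json
import re


def replace_tabs(text, width=4):
    """Replace every raw tab character with `width` spaces."""
    # The pattern matches the actual tab character only; a two-character
    # backslash-t escape written in the JSON text is left untouched.
    return re.sub(r"\t", " " * width, text)


def clean_file(src_path, dst_path, encoding="utf-8"):
    """Write a tab-free copy of src_path to dst_path (both paths are placeholders)."""
    with open(src_path, encoding=encoding) as src:
        cleaned = replace_tabs(src.read())
    with open(dst_path, "w", encoding=encoding) as dst:
        dst.write(cleaned)


# Made-up sample with a real tab inside a string value.
broken = '{"text": "a\tb"}'
fixed = replace_tabs(broken)
print(json.loads(fixed))  # {'text': 'a    b'}
```

Writing to a separate output file keeps the original untouched in case the replacement changes something you did not expect.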

Side remark: my JSON had many hardcoded \n line breaks, and I thought I would have to remove them as well, but they do no harm, so you can keep them.
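A quick way to see the difference, assuming "hardcoded \n" means the two-character escape sequence (backslash followed by n) rather than a real line break inside a string. The check below uses Python's standard json module with made-up sample strings; pandas uses its own parser, so treat this as an illustration rather than a guarantee of identical behaviour.

```python
import json

# Backslash followed by n (two characters) is a valid JSON escape sequence.
escaped = '{"text": "line one\\nline two"}'
parsed = json.loads(escaped)
print(parsed["text"].splitlines())  # ['line one', 'line two']

# A raw newline character inside a string is a control character and is rejected.
raw = '{"text": "line one\nline two"}'
try:
    json.loads(raw)
    raw_ok = True
except json.JSONDecodeError:
    raw_ok = False
print(raw_ok)  # False
```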

You may find other red markers in the JSON view of VS Code or another JSON editor. I also ran into a "JSONDecodeError"; both fixes are covered together in How can I fix the error "JSONDecodeError: Expecting value: ..." when loading a json file with json.load()?.

Upvotes: 0
