Gerasimos

Reputation: 319

json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 2)

I am trying to extract the text from a .json file that I extracted. The problem is that every time I try, I get the error in the title. Here is my code:

import json

with open('grtwe.json.json', 'r') as f:
    line = f.readline()
    tweet = json.loads(line)
    print(json.dumps(tweet, indent=4))

Also, the tweets are in Greek.

The first line of my .json file is this:

{"place": null, "geo": null, "source": "<a href=\"" rel=\"nofollow\">Twitter Lite</a>", "id_str": "967369573505921024", "favorite_count": 0, "in_reply_to_status_id": null, "favorited": false, "in_reply_to_user_id": null, "in_reply_to_status_id_str": null, "contributors": null, "is_quote_status": false, "full_text": "RT @documentonews:#Novartis_gate\n\u0391\u03c0\u03bf\u03ba\u03ac\u03bb\u03c5\u03c8\u03b7-\u03c3\u03bf\u03ba: \u039a\u03b1\u03b9 \u03c4\u03c1\u03af\u03c4\u03bf\u03c2 \u03bd\u03b5\u03ba\u03c1\u03cc\u03c2 \u03c3\u03c4\u03bf \u03b4\u03c1\u03cc\u03bc\u03bf \u03c4\u03b7\u03c2 Novartis, \u03c3\u03c4\u03bf Documento \u03c0\u03bf\u03c5 \u03ba\u03c5\u03ba\u03bb\u03bf\u03c6\u03bf\u03c1\u03b5\u03af \u03c4\u03b7\u03bd \u039a\u03c5\u03c1\u03b9\u03b1\u03ba\u03ae | https\u2026", "truncated": false, "user": {"notifications": false, "is_translator": false, "profile_image_url": "", "profile_background_tile": false, "id_str": "387685829", "geo_enabled": false, "profile_image_url_https":"", "statuses_count": 47093, "screen_name": "satrapis21", "is_translation_enabled": false, "followers_count": 1692, "has_extended_profile": false, "profile_background_image_url_https": "", "url": null, "follow_request_sent": false, "profile_sidebar_border_color": "FFFFFF", "profile_use_background_image": true, "profile_link_color": "D02B55", "profile_text_color": "3E4415", "description":"\u03be\u03b5\u03bd\u03bf\u03b4\u03bf\u03c7\u03bf\u03c2 \u03b3\u03ba\u03bf\u03c5\u03bb\u03b1\u03b3\u03ba \u03b5\u03c0\u03b5\u03bd\u03b4\u03c5\u03c4\u03b7\u03c2", "profile_background_color": "352726", "id": 387685829, "friends_count": 1689, "favourites_count": 3380, "created_at": "Sun Oct 09 14:01:48 +0000 2011", "default_profile": false, "translator_type": "none", "entities": {"description": {"urls": []}}, "profile_sidebar_fill_color": "99CC33", "default_profile_image": false, "listed_count": 39, "profile_banner_url": "","following": false, "utc_offset": 7200, "protected": false, "verified": false, "name": 
"\u03ba\u03bf\u03c5\u03bb\u03b7\u03c2satrapis", "profile_background_image_url":"", "time_zone": "Vilnius", "lang": "el", "contributors_enabled": false,"location": ""}, "metadata": {"result_type": "recent", "iso_language_code": "el"}, "id": 967369573505921024, "in_reply_to_screen_name": null, "created_at": "Sat Feb 2412:04:13 +0000 2018", "display_text_range": [0, 140], "retweeted": false, "in_reply_to_user_id_str": null, "lang": "el", "coordinates": null, "retweeted_status": {"place": null, "geo": null, "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>", "id_str": "967369433864863744", "favorite_count": 13, "in_reply_to_status_id": null, "favorited": false, "in_reply_to_user_id": null, "in_reply_to_status_id_str": null, "contributors": null, "is_quote_status": false,"full_text": "#Novartis_gate\n\u0391\u03c0\u03bf\u03ba\u03ac\u03bb\u03c5\u03c8\u03b7-\u03c3\u03bf\u03ba: \u039a\u03b1\u03b9 \u03c4\u03c1\u03af\u03c4\u03bf\u03c2 \u03bd\u03b5\u03ba\u03c1\u03cc\u03c2 \u03c3\u03c4\u03bf \u03b4\u03c1\u03cc\u03bc\u03bf \u03c4\u03b7\u03c2 Novartis, \u03c3\u03c4\u03bf Documento \u03c0\u03bf\u03c5 \u03ba\u03c5\u03ba\u03bb\u03bf\u03c6\u03bf\u03c1\u03b5\u03af \u03c4\u03b7\u03bd \u039a\u03c5\u03c1\u03b9\u03b1\u03ba\u03ae | ","truncated": false, "user": {"notifications": false, "is_translator": false, "profile_image_url": "", "profile_background_tile": false, "id_str": "795738344906952705", "geo_enabled": false, "profile_image_url_https": "", "statuses_count": 39383, "screen_name": "documentonews", "is_translation_enabled": false, "followers_count": 4607, "has_extended_profile": false, "profile_background_image_url_https": null, "url": "", "follow_request_sent": false, "profile_sidebar_border_color": "C0DEED", "profile_use_background_image": true, "profile_link_color": "1DA1F2", "profile_text_color": "333333", "description": "H \u039d\u03ad\u03b1 \u039c\u03b5\u03b3\u03ac\u03bb\u03b7 \u039a\u03c5\u03c1\u03b9\u03b1\u03ba\u03ac\u03c4\u03b9\u03ba\u03b7 
\u0395\u03c6\u03b7\u03bc\u03b5\u03c1\u03af\u03b4\u03b1", "profile_background_color": "F5F8FA", "id": 795738344906952705, "friends_count": 180, "favourites_count": 0, "created_at": "Mon Nov 07 21:23:00 +0000 2016", "default_profile": true, "translator_type": "none", "entities": {"url": {"urls": [{"url": "", "display_url": "documentonews.gr", "expanded_url": "", "indices": [0, 23]}]}, "description": {"urls": []}}, "profile_sidebar_fill_color": "DDEEF6", "default_profile_image": false, "listed_count": 69, "profile_banner_url": "", "following": false,"utc_offset": 7200, "protected": false, "verified": false, "name": "Documento", "profile_background_image_url": null, "time_zone": "Athens", "lang": "en", "contributors_enabled": false, "location": "Greece"},"metadata": {"result_type": "recent", "iso_language_code": "el"}, "id": 967369433864863744, "in_reply_to_screen_name": null, "created_at": "Sat Feb 24 12:03:40 +0000 2018", "display_text_range": [0, 162],"retweeted": false, "in_reply_to_user_id_str": null, "lang": "el", "coordinates": null, "entities": {"hashtags": [{"text": "Novartis_gate", "indices": [0, 14]}], "user_mentions": [], "symbols": [], "urls": [{"url": "", "display_url": "Documentonews.gr", "expanded_url": "", "indices": [115, 138]}, {"url": "", "display_url":"documentonews.gr/article/apokal\u2026", "expanded_url": "", "indices": [139, 162]}]}, "possibly_sensitive": false, "retweet_count": 10}, "entities": {"hashtags": [{"text": "Novartis_gate", "indices": [19, 33]}], "user_mentions": [{"name": "Documento", "id": 795738344906952705,"screen_name": "documentonews", "id_str": "795738344906952705", "indices": [3, 17]}], "symbols": [], "urls": []}, "possibly_sensitive": false, "retweet_count": 10}

The rest of the file contains such records.

Upvotes: 0

Views: 8265

Answers (2)

Mohabassam

Reputation: 1

loads() triggered the error, so the problem is in the JSON file itself, which does not appear to be valid JSON.

You need to make sure that the file is in correct JSON format. You can go to https://jsonlint.com and paste your JSON string there, and it will tell you whether it is valid.
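
If you would rather check locally than paste into a website, Python's own parser reports where it gave up. A minimal sketch, assuming you only want the position of the first problem; check_json is a hypothetical helper name, not something from the question:

```python
import json


def check_json(text):
    """Return None if text is valid JSON, else a description of where parsing failed."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # msg, lineno, colno and pos are attributes of json.JSONDecodeError
        return f"{err.msg}: line {err.lineno} column {err.colno} (char {err.pos})"
    return None


print(check_json('{"lang": "el"}'))  # a valid object, so this prints None
print(check_json('\n\n'))            # whitespace only, so the parser finds no value
```

The message has the same shape as the one in the question title, so the line and column point at the spot in the input where the parser expected a value and did not find one.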

Upvotes: 0

Joe Iddon

Reputation: 20414

This is most likely because you are only parsing the first line of the file (you call json.loads() on f.readline()). It is more probable that your whole file is a single JSON document, in which case you want to pass the whole thing in one go:

import json

with open('grtwe.json.json', 'r') as f:
    # Read and parse the entire file, not just its first line
    tweet = json.loads(f.read())
    print(json.dumps(tweet, indent=4))

However, I obviously can't check without the file!
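
If it turns out the file really does hold one complete record per line (the question says the rest of the file contains such records), reading line by line can still work as long as blank lines are skipped. A rough sketch under that assumption; iter_records and the sample data are made up for illustration:

```python
import json


def iter_records(lines):
    """Yield one parsed object per non-blank line (one JSON record per line)."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # a blank line would raise "Expecting value"
        try:
            yield json.loads(line)
        except json.JSONDecodeError as err:
            # Report which line of the input failed to parse
            raise ValueError(f"line {number} is not valid JSON: {err}") from err


# Made-up sample standing in for the lines of the real file
sample = [
    '{"full_text": "first", "lang": "el"}\n',
    '\n',
    '{"full_text": "second", "lang": "el"}\n',
]
for tweet in iter_records(sample):
    print(tweet["full_text"])
```

With the real file you would pass the open file object instead of the sample list, for example iter_records(f) inside the with block. If you want the Greek text printed as characters rather than \u escapes, json.dumps accepts ensure_ascii=False.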

Upvotes: 1
