Muhammad Arsalan Toor

Reputation: 201

Python requests: simplejson.errors.JSONDecodeError when fetching a table record from a URL

I am trying to fetch a table record using the requests library in Python. It works fine with other URLs, but the URL in the code below points to biometric machine data that is presented on the page as a table. I want to fetch the attendance records. Please review the code below and let me know if there is a possible solution.

 import json
 import requests
 import urllib
 url = "http://pokemondb.net/pokedex/all"

 data = requests.get(url).json()
 print(str(data.status_code))
 if data.status_code == 200:
    print(str(data.json()))

Error

Traceback (most recent call last):
  File "/home/arslan/Documents/Dynexcel/http_request_html_table/http_request.py", line 7, in <module>
    data = requests.get(url).json()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 897, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 518, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Upvotes: 0

Views: 1082

Answers (1)

Lukas Räpple

Reputation: 133

I hope I'm not missing something, but the page you are trying to retrieve data from returns HTML, not JSON. That is why Python raises a JSONDecodeError at this line:

 requests.get(url).json()
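One way to avoid this kind of error is to look at the Content-Type header before decoding. The following is only a minimal sketch: `decode_body` and `fetch` are hypothetical helper names, not part of requests, and the 10-second timeout is an arbitrary choice.

```python
import json


def decode_body(content_type, text):
    """Return parsed JSON when the body is JSON, otherwise the raw text.

    Hypothetical helper for illustration, not part of requests.
    """
    if "json" in content_type.lower():
        try:
            return json.loads(text)
        except ValueError:
            # The header claimed JSON but the body did not parse
            return text
    return text


def fetch(url):
    """Fetch `url` and decode the body only if the server says it is JSON."""
    import requests  # imported lazily so decode_body works without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    content_type = response.headers.get("Content-Type", "")
    return decode_body(content_type, response.text)
```

Calling `fetch(url)` on an HTML page would then return the page source as a string instead of raising, and that string can be passed to an HTML parser.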

There are several ways to parse this HTML and extract the data. One option is to fetch the page with requests and then extract the data with BeautifulSoup (bs4). By the way, I don't see an "attendance" column on this page. The code example below fetches the highest total score and the name of the corresponding pokemon:

import requests
from bs4 import BeautifulSoup

url = "http://pokemondb.net/pokedex/all"

# The page is HTML, so read .text instead of calling .json()
response = requests.get(url)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")

# Name links and "Total" cells are assumed to appear in the same row order
name_links = soup.find_all("a", {"class": "ent-name"})
total_cells = soup.find_all("td", {"class": "cell-total"})

all_pokemons = [link.get_text(strip=True) for link in name_links]
# Convert to int so that max() compares numbers rather than strings
all_scores = [int(td.get_text(strip=True)) for td in total_cells]

best_score = max(all_scores)
best_pokemon = all_pokemons[all_scores.index(best_score)]
print("Pokemon {} has the highest total score with: {}".format(best_pokemon, best_score))
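Since the question is about fetching whole table records, another option is to collect every row of the table. If installing bs4 is not possible, a rough sketch with the standard library's html.parser could look like the code below. `TableRows` and `table_rows` are hypothetical names, and the sample markup is invented for illustration; the real biometric page may be structured differently.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_rows(html):
    """Return all table rows found in `html` as lists of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented sample markup standing in for the real page
sample = """
<table>
  <tr><th>Name</th><th>Total</th></tr>
  <tr><td><a class="ent-name">Bulbasaur</a></td><td class="cell-total">318</td></tr>
  <tr><td><a class="ent-name">Ivysaur</a></td><td class="cell-total">405</td></tr>
</table>
"""
rows = table_rows(sample)
```

With a live page, the HTML text returned by requests would be passed to `table_rows` in place of `sample`.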

Upvotes: 0
