Reputation: 63
Good day. I'm having trouble extracting values from JSON. First, my BeautifulSoup code works fine in the shell but not in Django. Second, I'm trying to extract data from the JSON I receive, but without success. Here's the class in my view that does it:
class FetchWeather(generic.TemplateView):
    template_name = 'forecastApp/pages/weather.html'
    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        url = 'http://weather.news24.com/sa/cape-town'
        city = 'cape town'
        url_request = requests.get(url)
        soup = BeautifulSoup(url_request.content, 'html.parser')
        city_list = soup.find(id="ctl00_WeatherContentHolder_ddlCity")
        print(soup.head)
        city_as_on_website = city_list.find(text=re.compile(city, re.I)).parent
        cityId = city_as_on_website['value']
        json_url = "http://weather.news24.com/ajaxpro/TwentyFour.Weather.Web.Ajax,App_Code.ashx"
        headers = {
            'Content-Type': 'text/plain; charset=UTF-8',
            'Host': 'weather.news24.com',
            'Origin': 'http://weather.news24.com',
            'Referer': url,
            'User-Agent': 'Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/48.0.2564.82 Chrome/48.0.2564.82 Safari/537.36',
            'X-AjaxPro-Method': 'GetCurrentOne'}
        payload = {
            "cityId": cityId
        }
        request_post = requests.post(json_url, headers=headers, data=json.dumps(payload))
        print(request_post.content)
        context['Observations'] = request_post.content
        return context
The JSON contains an array "Observations" from which I'm trying to get the city name and the high and low temperatures.
But when I tried to do this:
    cityDict = json.loads(str(html))
I received an error. Here's the traceback:
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 4067 (char 4066)
Any help will be greatly appreciated.
Upvotes: 1
Views: 331
Reputation: 473863
There are two problems with your JSON data inside request_post.content:
there are JS date object values there, for instance:
"Date":new Date(Date.UTC(2016,1,26,22,0,0,0))
there are unwanted characters at the end: ;/*".
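Either problem on its own is enough to make json.loads give up. As a minimal reproduction, here is a made-up fragment shaped like the response (the sample string is an assumption for illustration, not the real payload):

```python
import json

# Hypothetical fragment imitating the response: a JS Date constructor
# is not valid JSON, and ";/*" trails the closing brace.
sample = '{"Date":new Date(Date.UTC(2016,1,26,22,0,0,0))};/*'

try:
    json.loads(sample)
    error = None
except json.JSONDecodeError as exc:
    error = exc

# The decoder stops where the unquoted "new Date(...)" begins,
# with the same "Expecting value" message as in the traceback.
print(error)
```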
Let's clean the JSON data so that it can be loaded with json:
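import json
import re
from datetime import datetime

data = request_post.text

def convert_date(match):
    # JS Date.UTC months are zero-based (1 is February); datetime months start at 1
    year, month, day, hour, minute, second, millis = map(int, match.groups())
    moment = datetime(year, month + 1, day, hour, minute, second, millis * 1000)
    return '"' + moment.strftime("%Y-%m-%dT%H:%M:%S") + '"'

data = re.sub(r"new Date\(Date\.UTC\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)\)",
              convert_date,
              data)
# remove the trailing ";/*" characters so that only the JSON remains
data = data.strip(";/*")
data = json.loads(data)
context['Observations'] = data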
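If you want to try the cleanup outside the view, something like the following sketch may help. It wraps the same two steps in a function and runs them on a made-up sample. Only "Observations" and the Date syntax come from your post; the keys "CityName", "HighTemp" and "LowTemp" are placeholders I invented, so substitute whatever the real response uses. Note that it shifts the month by one, since Date.UTC counts months from 0.

```python
import json
import re
from datetime import datetime

DATE_RE = re.compile(
    r"new Date\(Date\.UTC\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)\)"
)

def js_date_to_iso(match):
    """Replace a JS Date.UTC(...) constructor with a quoted ISO timestamp."""
    year, month, day, hour, minute, second, millis = map(int, match.groups())
    # Date.UTC counts months from 0, datetime counts them from 1.
    moment = datetime(year, month + 1, day, hour, minute, second, millis * 1000)
    return '"' + moment.strftime("%Y-%m-%dT%H:%M:%S") + '"'

def clean_response(text):
    """Turn the JS-flavoured response text into a Python object."""
    text = DATE_RE.sub(js_date_to_iso, text)
    return json.loads(text.strip(";/*"))

# Made-up sample; "CityName", "HighTemp" and "LowTemp" are placeholder keys.
sample = (
    '{"Observations":[{"CityName":"Cape Town","HighTemp":"27","LowTemp":"17",'
    '"Date":new Date(Date.UTC(2016,1,26,22,0,0,0))}]};/*'
)

parsed = clean_response(sample)
for obs in parsed["Observations"]:
    print(obs["CityName"], obs["HighTemp"], obs["LowTemp"], obs["Date"])
```

In the view you would then call clean_response(request_post.text) and put the resulting "Observations" list into the context.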
Upvotes: 1