Reputation: 409
I am trying to pull the JSON from a urllib.request response for a Twitter page. I am doing this partly out of curiosity and partly to work out what to request with Scrapy, so that I can write code that gets past Twitter's infinite scrolling and pulls all the tweets from a user's timeline.
(I know there are packages that already do this, but I want to set it up myself to learn by doing :) )
I have been using the urllib package to fetch the data, but I keep running into a frustrating error when I attempt it:
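import json
import urllib.request  # "import urllib" alone does not reliably load the request submodule

with urllib.request.urlopen("https://twitter.com/vonkraush") as url:
    data = url.read().decode()

# Fails here with JSONDecodeError
print(json.loads(data))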
Traceback (most recent call last):
  File "<ipython-input-30-208336effb36>", line 1, in <module>
    json.loads(data)
  File "C:\Users\Josh\Anaconda3\lib\json\__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "C:\Users\Josh\Anaconda3\lib\json\decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\Josh\Anaconda3\lib\json\decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
JSONDecodeError: Expecting value
I've tried explicitly passing 'utf-8' to decode() and a few other approaches, but nothing so far has let me get past this error. What am I doing wrong, and how can I fix it?
Upvotes: 0
Views: 353
Reputation: 81
You are doing it wrong. This URL will always return an HTML page, not JSON, which is why json.loads fails with "Expecting value". To get user data from Twitter, use the Twitter Dev API.
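To see why the parse fails, it can help to check what kind of document the server says it sent before handing it to json.loads. The following is only a minimal sketch: the helper name parse_json_response and the sample bodies are made up for illustration, and it assumes the server sets a Content-Type header.

```python
import json


def parse_json_response(body, content_type):
    """Parse body as JSON only if the declared media type is JSON.

    parse_json_response is a hypothetical helper, not part of any library.
    """
    # "text/html; charset=utf-8" -> "text/html"
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type != "application/json":
        raise ValueError(f"expected JSON but the server sent {media_type!r}")
    return json.loads(body)


# A profile page is served as HTML, so it is rejected before json.loads runs.
html_body = "<!DOCTYPE html><html><head></head><body></body></html>"
try:
    parse_json_response(html_body, "text/html; charset=utf-8")
except ValueError as err:
    print(err)

# A body that really is JSON parses as usual.
print(parse_json_response('{"id": 1}', "application/json; charset=utf-8"))
```

With urlopen, the header value should be available from the response object as url.headers.get("Content-Type"), so the same check can be applied to the request in the question.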
The Twitter Dev API might help you extract information from Twitter, but to use it you will have to authenticate yourself as a Twitter user. Make sure you create a Twitter app first and get your OAuth key; it will be your access to the Twitter API.
The Twitter API uses token-based authentication. The token that you receive in response to the API call will be your identity as a user.
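As a rough illustration of token-based access with urllib, the sketch below attaches a token to a request as a bearer token. Everything specific in it is an assumption: the endpoint URL and the token are placeholders, build_api_request is a hypothetical helper, and the real endpoints and authentication scheme should be taken from the Twitter API documentation.

```python
import urllib.request


def build_api_request(url, bearer_token):
    """Return a Request carrying the token in the Authorization header.

    build_api_request is a hypothetical helper; the "Bearer" scheme is an
    assumption about how the API expects the token to be sent.
    """
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Accept": "application/json",
        },
    )


# Placeholder endpoint and token, for illustration only.
request = build_api_request(
    "https://api.example.com/users/vonkraush/tweets",
    "YOUR_TOKEN",
)
print(request.full_url)
print(request.get_header("Authorization"))
```

Nothing is sent until the request is passed to urllib.request.urlopen(request). If the endpoint does return JSON, the body can then be decoded and given to json.loads as in the question.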
Upvotes: 1