Reputation: 345
I am trying to get all the websites from this JSON file.
Unfortunately, when I use this code:
import requests
response = requests.get("https://github.com/solana-labs/token-list/blob/main/src/tokens/solana.tokenlist.json")
output = response.json()
# Extract specific node content.
print(output['website'])
I get the following error:
Traceback (most recent call last):
  File "/Users/dusandev/Desktop/StreamFlowWebTests/extract.py", line 5, in <module>
    output = response.json()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/models.py", line 900, in json
    return complexjson.loads(self.text, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 7 column 1 (char 6)
Any help is appreciated. Thank you in advance.
Upvotes: 0
Views: 254
Reputation: 2561
This error usually means that the response body cannot be parsed as JSON.
You have two options. Either request the raw file, which returns JSON directly:
import requests
response = requests.get("https://raw.githubusercontent.com/solana-labs/token-list/main/src/tokens/solana.tokenlist.json")
output = response.json()
first_website = output["tokens"][0]["extensions"]["website"]
# all websites:
for token in output['tokens']:
    if extensions := token.get('extensions'):
        print(extensions.get('website'))
#output:
'https://www.angle.money'
Alternatively, parse the rendered HTML page with BeautifulSoup:
- https://www.dataquest.io/blog/web-scraping-python-using-beautiful-soup/
Upvotes: 0
Reputation: 2795
Use the raw data URL to get raw JSON, and then iterate over the 'tokens' key of the parsed response:
import requests
response = requests.get(
    "https://raw.githubusercontent.com/solana-labs/token-list/main/src/tokens/solana.tokenlist.json")
output = response.json()
for i in output['tokens']:
    if i.get('extensions'):
        print(i.get('extensions').get('website'))
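If a list is more useful than printed output, the loop above could be wrapped in a small helper. This is only a sketch: `extract_websites` is a hypothetical name, the sample data is made up for illustration, and it assumes the parsed file keeps the shape shown in this thread (a "tokens" list whose entries may carry an "extensions" dict with a "website" key).

```python
def extract_websites(token_list):
    """Return every non-empty 'website' value found under tokens[*].extensions."""
    websites = []
    for token in token_list.get("tokens", []):
        # A token may have no "extensions" entry, or extensions without a website.
        extensions = token.get("extensions") or {}
        website = extensions.get("website")
        if website:
            websites.append(website)
    return websites


# Small inline sample (invented for illustration) instead of a network call.
sample = {
    "tokens": [
        {"symbol": "agEUR", "extensions": {"website": "https://www.angle.money"}},
        {"symbol": "NOEXT"},
        {"symbol": "NOWEB", "extensions": {"twitter": "https://twitter.com/example"}},
    ]
}
print(extract_websites(sample))
```

With the real data you would pass the parsed response instead, for example `extract_websites(response.json())`.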
Upvotes: 1
Reputation: 2556
If you visit the URL https://github.com/solana-labs/token-list/blob/main/src/tokens/solana.tokenlist.json in a browser, you'll get a fully rendered web page. To get just the JSON, you need to use the "view raw" link. That winds up being
https://raw.githubusercontent.com/solana-labs/token-list/main/src/tokens/solana.tokenlist.json
You will then have several thousand elements in the list attached to the "tokens" key of the response dictionary. To get each website, iterate through that list and look inside each token's "extensions" dictionary:
>>> output["tokens"][0]
{'chainId': 101, 'address': 'CbNYA9n3927uXUukee2Hf4tm3xxkffJPPZvGazc2EAH1', 'symbol': 'agEUR', 'name': 'agEUR (Wormhole)', 'decimals': 8, 'logoURI': 'https://raw.githubusercontent.com/solana-labs/token-list/main/assets/mainnet/CbNYA9n3927uXUukee2Hf4tm3xxkffJPPZvGazc2EAH1/logo.png', 'tags': ['ethereum', 'wrapped', 'wormhole'], 'extensions': {'address': '0x1a7e4e63778B4f12a199C062f3eFdD288afCBce8', 'assetContract': 'https://etherscan.io/address/0x1a7e4e63778B4f12a199C062f3eFdD288afCBce8', 'bridgeContract': 'https://etherscan.io/address/0x3ee18B2214AFF97000D974cf647E7C347E8fa585', 'coingeckoId': 'ageur', 'description': 'Angle is the first decentralized, capital efficient and over-collateralized stablecoin protocol', 'discord': 'https://discord.gg/z3kCpTaKMh', 'twitter': 'https://twitter.com/AngleProtocol', 'website': 'https://www.angle.money'}}
>>> output["tokens"][0]["extensions"]["website"]
'https://www.angle.money'
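The mapping from the "blob" page URL to the raw URL follows a visible pattern in the two addresses above, so it can be done in code. The following is a rough sketch, not an official GitHub API: `to_raw_url` is a hypothetical helper, and it assumes the simple `github.com/<owner>/<repo>/blob/<ref>/<path>` layout seen in this question.

```python
from urllib.parse import urlsplit


def to_raw_url(blob_url):
    """Turn a github.com '/blob/' page URL into its raw.githubusercontent.com form."""
    parts = urlsplit(blob_url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: owner / repo / "blob" / ref / path...
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url}")
    owner, repo, _blob, *rest = segments
    # Dropping the "blob" segment and switching hosts gives the raw address.
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


page_url = "https://github.com/solana-labs/token-list/blob/main/src/tokens/solana.tokenlist.json"
print(to_raw_url(page_url))
```

The result can then be passed to `requests.get(...)` as in the question.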
Upvotes: 0
Reputation: 9508
The URL https://github.com/solana-labs/token-list/blob/main/src/tokens/solana.tokenlist.json does not return JSON; it returns GitHub's HTML page for the file. Use https://raw.githubusercontent.com/solana-labs/token-list/main/src/tokens/solana.tokenlist.json instead.
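To make this kind of mistake easier to spot, one option is to parse the body yourself and report what was actually received. A minimal sketch, assuming you pass in `response.text`; `parse_json_body` is a hypothetical helper name, not part of requests.

```python
import json


def parse_json_body(text):
    """Parse text as JSON, or raise an error that shows what the body looks like."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Show the start of the body so an HTML page is easy to recognise.
        preview = text.lstrip()[:40]
        hint = " (looks like HTML)" if preview.startswith("<") else ""
        raise ValueError(f"response is not JSON{hint}: {preview!r}") from exc


# Stand-in strings for illustration; no network request is made here.
print(parse_json_body('{"tokens": []}'))
try:
    parse_json_body("\n\n<!DOCTYPE html><html></html>")
except ValueError as err:
    print(err)
```

In the question's script this would be called as `parse_json_body(response.text)` in place of `response.json()`.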
Upvotes: 0