Reputation: 956
In my HTML I have elements like this:
<script class="ember-view" id="ember36032292" name="schema:podcast-show" type="application/ld+json">
{"@context":"http://schema.org","@type":"CreativeWork","name":"A2C Random talk","author":"a2crandom","description":"We tackle tech. We tackle tv. We tackle everything","datePublished":"Oct 12, 2015","offers":[{"@type":"Offer","price":"Free"}],"review":[],"workExample":[{"@type":"AudioObject","name":"just a test for itunes","datePublished":"Oct 12, 2015","description":"test test test","duration":"PT7S","requiresSubscription":"no"}]}
</script>
How can I get this string as a dictionary? I currently extract the text like this:
description = soup.find('script', {'name': 'schema:podcast-show'}).get_text()
Upvotes: 0
Views: 36
Reputation: 43495
The tag says type="application/ld+json", which is a form of JSON (JSON-LD), so we can parse it with json.loads:
In [1]: import json
In [2]: json.loads('''{"@context":"http://schema.org","@type":"CreativeWork","name":"A2C Random talk","author":"a2crandom","description":"We tackle tech. We tackle tv. We tackle everything","datePublished":"Oct 12, 2015","offers":[{"@type":"Offer","price":"Free"}],"review":[],"workExample":[{"@type":"AudioObject","name":"just a test for itunes","datePublished":"Oct 12, 2015","description":"test test test","duration":"PT7S","requiresSubscription":"no"}]}''')
Out[2]:
{'@context': 'http://schema.org',
'@type': 'CreativeWork',
'name': 'A2C Random talk',
'author': 'a2crandom',
'description': 'We tackle tech. We tackle tv. We tackle everything',
'datePublished': 'Oct 12, 2015',
'offers': [{'@type': 'Offer', 'price': 'Free'}],
'review': [],
'workExample': [{'@type': 'AudioObject',
'name': 'just a test for itunes',
'datePublished': 'Oct 12, 2015',
'description': 'test test test',
'duration': 'PT7S',
'requiresSubscription': 'no'}]}
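To tie this back to the code in the question, here is a minimal sketch of the same idea as a script. The variable raw is only a stand-in for whatever soup.find(...).get_text() returns, and parse_ld_json is a hypothetical helper name, not part of any library. It assumes the script tag contains a single valid JSON document.

```python
import json

# Stand-in for the text returned by
# soup.find('script', {'name': 'schema:podcast-show'}).get_text()
# (assumption: the tag body is one JSON document, possibly padded with whitespace).
raw = '''
{"@context":"http://schema.org","@type":"CreativeWork","name":"A2C Random talk","author":"a2crandom","description":"We tackle tech. We tackle tv. We tackle everything","datePublished":"Oct 12, 2015","offers":[{"@type":"Offer","price":"Free"}],"review":[],"workExample":[{"@type":"AudioObject","name":"just a test for itunes","datePublished":"Oct 12, 2015","description":"test test test","duration":"PT7S","requiresSubscription":"no"}]}
'''


def parse_ld_json(text):
    """Hypothetical helper: return the parsed object, or None if text is not valid JSON."""
    try:
        # json.loads ignores leading and trailing whitespace around the document.
        return json.loads(text)
    except json.JSONDecodeError:
        return None


data = parse_ld_json(raw)

# The result is an ordinary dict, so nested values are reached by key and index.
print(data["name"])
print(data["workExample"][0]["duration"])
```

Catching json.JSONDecodeError is optional; it is included here because scraped pages do not always contain well-formed JSON.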
Upvotes: 2