Reputation: 739
in python i'm reading an html page content which contains a lot of stuff. To do this i read the webpage as string by this way:
url = 'https://myurl.com/'
reqq = req.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
reddit_file = req.urlopen(reqq)
reddit_data = reddit_file.read().decode('utf-8')
if i print the reddit_data
i can see correctly the whole html contents.
Now, inside it there's a structure like json that i would like to read and extract some fields from that.
Below the structure:
"dealDetails" : {
"f240141a" : {
"egressUrl" : "https://ccc.com",
"title" : "ZZZ",
"type" : "ghi",
},
"5f9ab246" : {
"egressUrl" : "https://www.bbb.com/",
"title" : "YYY",
"type" : "def",
},
"2bf6723b" : {
"egressUrl" : "https://www.aaa.com//",
"title" : "XXX",
"type" : "abc",
},
}
What i want to do is: find the dealDetails
field and then for each f240141a
5f9ab246
2bf6723b
get the egressURL, title and type values.
Thanks
Upvotes: 0
Views: 158
Reputation: 59
[(i['egressUrl'], i['title'], i['type']) for i in reddit_data['dealDetails'].keys()]
dictionary = eval(reddit_data)
This will convert the whole file into a dictionary, I recommend that you only use it on the part of the text that 'looks' like a dictionary! (One of the reason eval is unpopular, is because it won't convert strings like 'true'/'false' to Python's True/False, be careful with that :) )
Hope that helped!
Upvotes: 0
Reputation: 5785
Try this,
[nested_dict['egressUrl'] for nested_dict in reddit_data['dealDetails'].keys()]
To access the values of JSON, you can consider as dictionary and use the same syntax to access values as well.
Edit-1:
Make sure your type of reddit_data is a dictionary.
if type(reddit_data)
is str
.
You need to do..
import ast
reddit_data = ast.literal_eval(reddit_data)
OR
import json
reddit_data = json.loads(reddit_data)
Upvotes: 3