Reputation: 690
The code below runs without errors, but it is incomplete. From this point, I am trying to get only the fullgame values into a DataFrame.
import json
from bs4 import BeautifulSoup
import urllib.request
source = urllib.request.urlopen('https://www.oddsshark.com/nfl/odds').read()
soup = BeautifulSoup(source, 'html.parser')
results = soup.find_all(class_ = "op-item op-spread op-opening")
for result in results:
    print(json.loads(result['data-op-info']).items())
I added the print at the end because I was trying to extract only the line value and inspect it.
Note that there is a similar question on this site, but its solution works for only one div. It fails when the variable holds multiple divs.
How to parse information between {} on web page using Beautifulsoup
Upvotes: 0
Views: 502
Reputation: 5648
You were almost there. Use a list comprehension to capture the parsed results, then pass the list to pd.json_normalize():
import json
import urllib.request
import pandas as pd  # provides json_normalize
from bs4 import BeautifulSoup
source = urllib.request.urlopen('https://www.oddsshark.com/nfl/odds').read()
soup = BeautifulSoup(source, 'html.parser')
results = soup.find_all(class_="op-item op-spread op-opening")
# Parse the JSON stored in each div's data-op-info attribute into a dict
rlist = [json.loads(result['data-op-info']) for result in results]
pd.json_normalize(rlist)
fullgame firsthalf secondhalf firstquarter secondquarter thirdquarter fourthquarter
0 -4.5 -2.5 -1.5 -0.5 -0.5 -0.5 -0.5
1 +4.5 +2.5 +1.5 +0.5 +0.5 +0.5 +0.5
2 +7 +4 +3.5 +3 +3 +2.5 +2
3 -7 -4 -3.5 -3 -3 -2.5 -2
4 -3 -3 -2.5 -0.5 -2 -0.5 -0.5
5 +3 +3 +2.5 +0.5 +2 +0.5 +0.5
6 +3 +2.5 +0.5 +0.5 +0.5 +0.5 +0.5
7 -3 -2.5 -0.5 -0.5 -0.5 -0.5 -0.5
8 -3 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5
9 +3 +0.5 +0.5 +0.5 +0.5 +0.5 +0.5
10 -3 -2.5 -1 -0.5 -1 -0.5 -0.5
11 +3 +2.5 +1 +0.5 +1 +0.5 +0.5
12 -1 +0.5 -0.5 +0.5 -0.5 -0.5 -0.5
13 +1 -0.5 +0.5 -0.5 +0.5 +0.5 +0.5
14 +2.5 +3.5 +3 +0.5 +2.5 +0.5 +1
15 -2.5 -3.5 -3 -0.5 -2.5 -0.5 -1
16 +4 +3 +2 +0.5 +1 +0.5 +0.5
17 -4 -3 -2 -0.5 -1 -0.5 -0.5
18 -2.5 -0.5 -0.5 +0.5 -0.5 -0.5 -0.5
19 +2.5 +0.5 +0.5 -0.5 +0.5 +0.5 +0.5
20 -2.5 -1.5 -0.5 -0.5 -0.5 -0.5 -0.5
21 +2.5 +1.5 +0.5 +0.5 +0.5 +0.5 +0.5
22 +2.5 +1.5 +0.5 +0.5 +0.5 +0.5 +0.5
23 -2.5 -1.5 -0.5 -0.5 -0.5 -0.5 -0.5
24 +1.5 +1.5 Ev +0.5 -0.5 -0.5 -0.5
25 -1.5 -1.5 Ev -0.5 +0.5 +0.5 +0.5
26 +5.5 +3 +2.5 +0.5 +0.5 +0.5 +0.5
27 -5.5 -3 -2.5 -0.5 -0.5 -0.5 -0.5
28 -3.5 -0.5 Ev -0.5 +0.5 +0.5 +0.5
29 +3.5 +0.5 Ev +0.5 -0.5 -0.5 -0.5
30 -5
31 +5
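If you want to try the same approach without hitting the live site, here is a minimal sketch that runs against an inline HTML string. The markup in SAMPLE_HTML and the helper name spreads_to_frame are made up for illustration; only the class name and the data-op-info attribute come from the code above. The third div carries only a fullgame key, which suggests why the last rows in the output above have no values in the other columns: json_normalize leaves missing keys as NaN.

```python
import json

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page markup; the real page may differ.
SAMPLE_HTML = """
<div class="op-item op-spread op-opening"
     data-op-info='{"fullgame": "-4.5", "firsthalf": "-2.5"}'></div>
<div class="op-item op-spread op-opening"
     data-op-info='{"fullgame": "+4.5", "firsthalf": "+2.5"}'></div>
<div class="op-item op-spread op-opening"
     data-op-info='{"fullgame": "-5"}'></div>
"""


def spreads_to_frame(html):
    """Collect every matching div's data-op-info JSON into one DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    divs = soup.find_all(class_="op-item op-spread op-opening")
    # One dict per div; keys missing from a dict become NaN in the frame.
    records = [json.loads(div["data-op-info"]) for div in divs]
    return pd.json_normalize(records)


df = spreads_to_frame(SAMPLE_HTML)
print(df)
```

Because the list comprehension walks every element returned by find_all, it does not matter how many divs the page contains.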
Or, if you really just want one key from the dictionary:
rlist = [json.loads(result['data-op-info'])['fullgame'] for result in (results)]
pd.DataFrame({'fullgame': rlist})
fullgame
0 -4.5
1 +4.5
2 +7
3 -7
4 -3
5 +3
6 +3
7 -3
8 -3
9 +3
10 -3
11 +3
12 -1
13 +1
14 +2.5
15 -2.5
16 +4
17 -4
18 -2.5
19 +2.5
20 -2.5
21 +2.5
22 +2.5
23 -2.5
24 +1.5
25 -1.5
26 +5.5
27 -5.5
28 -3.5
29 +3.5
30 -5
31 +5
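Note that the values come back as strings ('+4.5', '-3'), and some columns in the earlier output contain 'Ev'. If you need numbers, something along these lines might help. This is only a sketch: the function name spread_to_float is hypothetical, and treating 'Ev' as 0.0 is my assumption about what the site means by it, so check that against the page before relying on it.

```python
def spread_to_float(text):
    """Convert a spread string such as '+4.5' or '-3' to a float.

    Assumption: 'Ev' denotes an even line and is mapped to 0.0.
    Returns None for missing or empty values.
    """
    if text is None:
        return None
    text = text.strip()
    if not text:
        return None
    if text.lower() == "ev":
        return 0.0
    # float() accepts a leading '+' or '-' sign.
    return float(text)


# Sample values in the same string format as the scraped data.
fullgame = ["-4.5", "+4.5", "+7", "Ev", ""]
values = [spread_to_float(v) for v in fullgame]
print(values)
```

With the DataFrame from above, the same function could be applied to a column, for example df['fullgame'].map(spread_to_float).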
Upvotes: 1