Can't scrape a web page with BeautifulSoup or lxml

Question

I am very new to programming, so this may be a silly question. I wanted to learn to scrape web pages, so I learned BeautifulSoup. It worked for a few sites, but I got stuck on the following page:

from bs4 import BeautifulSoup
import requests

r = requests.get("http://www.dlb.today/result/en")
data = r.text
soup = BeautifulSoup(data, "lxml")

data = soup.find_all("tbody", {"id": "pageData1"})
data2 = soup.find_all("ul", {"class": "res_allnumber"})
print(data)
print(data2)
# no point going further if I can't get the raw data, I think

This worked fine on a similar site I scraped:

r2 = requests.get("http://www.nlb.lk/results-more.php?id=1")
data2 = r2.text
soup2 = BeautifulSoup(data2, "lxml")
news2 = soup2.find_all("a", {"class": "lottery-numbers"})
# print(news2)  # (get raw HTML for checking)
for draw_number in news2:
    print(draw_number.contents[0])

I couldn't scrape the table I wanted, so I tried lxml instead. Still no luck:

# lxml
import requests
import lxml.html as LH

r = requests.get("http://www.dlb.today/result/en")
data = r.text

# print(data)

root = LH.fromstring(data)
for tag1 in root.xpath('//tbody[@class="pageData1"]//li'):
    print(tag1.text_content())

I don't know where my error is or what to do next. If anyone can point me in the right direction, I would appreciate it!

Dan-Dev · Accepted Answer

This page uses JavaScript to load the data it displays. requests does not execute JavaScript, so that data is missing from r.text. Fortunately, the JavaScript loads another HTML page from the URL

http://www.dlb.today/result/pagination_re
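A quick way to check for this kind of problem is to look for the id or class you are after in the raw response text before parsing it. The following is only a rough sketch: find_markers is a hypothetical helper name, and sample_html is a made-up stand-in for r.text, not the real page.

```python
def find_markers(html, markers):
    """Report which marker strings occur anywhere in the raw HTML text."""
    return {marker: marker in html for marker in markers}


# Hypothetical sample standing in for r.text: the table body is present
# but empty, because a script would fill it in after the page loads.
sample_html = (
    '<table><tbody id="pageData1"></tbody></table>'
    '<script src="app.js"></script>'
)

print(find_markers(sample_html, ["pageData1", "res_allnumber"]))
# prints {'pageData1': True, 'res_allnumber': False}
```

If a marker comes back False for the page in the question, that would be consistent with the rows being fetched from the pagination_re URL rather than being part of the page itself.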

You can access this URL with a POST request directly like this:

import requests
from bs4 import BeautifulSoup

url = "http://www.dlb.today/result/pagination_re"
data = {"pageId": "0", "resultID": "1001", "lotteryID": "1", "lastsegment": "en"}
page = requests.post(url, data=data)
soup = BeautifulSoup(page.content, "html.parser")
for ul in soup.find_all("ul", {"class": "res_allnumber"}):
    print(ul)
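The same request can be wrapped in small helpers so the form fields are easy to vary. This is a minimal sketch, not a tested client for the site: build_payload and fetch_numbers are hypothetical names, the field names are copied from the snippet above, and what each field means on the server (for example, that pageId selects a page of results) is an assumption.

```python
def build_payload(page_id=0, result_id=1001, lottery_id=1, lang="en"):
    """Build the form fields for the pagination_re request as strings."""
    return {
        "pageId": str(page_id),
        "resultID": str(result_id),
        "lotteryID": str(lottery_id),
        "lastsegment": lang,
    }


def fetch_numbers(payload, url="http://www.dlb.today/result/pagination_re"):
    """POST the payload and return the text of each ul.res_allnumber element."""
    # Imported here so build_payload can be used without the network libraries.
    import requests
    from bs4 import BeautifulSoup

    page = requests.post(url, data=payload, timeout=10)
    page.raise_for_status()  # fail loudly on HTTP errors
    soup = BeautifulSoup(page.content, "html.parser")
    return [
        ul.get_text(" ", strip=True)
        for ul in soup.find_all("ul", {"class": "res_allnumber"})
    ]


print(build_payload(page_id=2))
# prints {'pageId': '2', 'resultID': '1001', 'lotteryID': '1', 'lastsegment': 'en'}

# Example (needs network access, so it is left commented out):
# for page_id in range(3):
#     print(page_id, fetch_numbers(build_payload(page_id=page_id)))
```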

You may have to experiment with the "data" values to get exactly what you want!

The output is each ul element with the class res_allnumber found in the response.
