Reputation: 974
I have this start URL for crawling.
When I send a request from scrapy shell, it is crawled without any problems. I can see that the full page is rendered when I use view(response). This is the HTML code and the rendered website.
However, when I try to use selectors to get the a tags, they don't work. It's as if the whole HTML table body is not there.
response.css('tbody').getall()
returns an empty table body, so the a tags I'm looking for are not there.
I also checked whether there is an AJAX request that I'm missing, but there is not. What's the problem here?
Upvotes: 0
Views: 32
Reputation: 10666
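You need to check the page's source HTML (usually Ctrl+U in a browser), because that is what your selectors run against. The page looks complete in view(response) only because the browser executes the page's JavaScript when it opens the saved response. For your URL, you'll find that the target table is filled in by JavaScript code starting with var COLLECTION = [. You can parse that part with the code below: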
import json

def parse(self, response):
    # Pull the JavaScript array literal out of the inline <script> block
    json_collection = response.xpath('//script[contains(., "var COLLECTION = [")]').re_first(r'var COLLECTION = (\[.+?\]);')
    data = json.loads(json_collection)  # now you have everything you need here
    for element in data:
        mark = element["mar"]
        version = element["ver"]
        ........
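If you want to try the extraction step without running a spider, here is a minimal standalone sketch that uses only the standard library. The SAMPLE_HTML string, its values, and the extract_collection helper are made up for illustration; only the var COLLECTION name and the "mar" / "ver" keys come from the page discussed above. It also assumes the array spans may contain newlines, which is why re.DOTALL is used here.

```python
import json
import re

# Hypothetical page source: the table body is empty in the raw HTML and the
# rows live in a JavaScript array instead. The values below are invented.
SAMPLE_HTML = """
<html><body>
<table><tbody></tbody></table>
<script>
var COLLECTION = [{"mar": "Acme", "ver": "1.0"},
                  {"mar": "Globex", "ver": "2.3"}];
</script>
</body></html>
"""


def extract_collection(html):
    """Return the list assigned to `var COLLECTION`, or [] if it is absent."""
    # re.DOTALL lets "." cross line breaks; the non-greedy match stops at
    # the first "];" that follows the opening bracket.
    match = re.search(r"var COLLECTION = (\[.+?\]);", html, re.DOTALL)
    if match is None:
        return []
    return json.loads(match.group(1))


rows = extract_collection(SAMPLE_HTML)
for row in rows:
    print(row["mar"], row["ver"])
```

Note that json.loads only works when the array literal happens to be valid JSON (double-quoted keys and strings, no trailing commas). If the real page uses looser JavaScript syntax, this step would need a more tolerant parser.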
Upvotes: 1