Reputation: 857
I am scraping Commonwealth Games medal counts from this page: https://en.wikipedia.org/wiki/1930_British_Empire_Games
Once the data is scraped, I want to move on to the next page. To do so, I want to select the <table> tag whose id attribute is 'collapsibleTable1'.
Now here comes the interesting part. When I run $('#collapsibleTable1') in the Chrome console, I get the desired output.
However, when I try response.css('#collapsibleTable1') in the scrapy shell, it returns an empty list.
It would be of great help if somebody could explain why it's behaving this way.
Upvotes: 0
Views: 825
Reputation: 859
I had the same problem when I was just starting out with web crawling: I couldn't scrape certain content from a website. As stranac put it, some content is rendered dynamically by JavaScript, so the solution is to go to the data source.
I'm adding this answer because some people, like me, don't know where to start and might need some direction. Please see the official scrapy documentation on how to get the data from the data source; there are multiple ways to handle it, depending on your situation.
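My understanding from the above is that there are 2 ways to deal with this problem: find the data source and request it directly, or have the JavaScript rendered before parsing (for example with a headless browser).
More details are covered in the official docs. Hope the reference helps.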
Upvotes: 0
Reputation: 28256
It looks like there is some JavaScript manipulation happening, as that id isn't contained in the actual HTML source (which you can see if you print(response.text)).
Chrome's dev tools show the current state of the DOM after all the JavaScript has been executed, which is not what scrapy sees.
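A quick way to confirm this is to test whether the id appears anywhere in the raw source. The snippet below is only a minimal sketch: has_id is a hypothetical helper (not part of scrapy), and the two HTML strings are shortened stand-ins for the downloaded source and the DOM Chrome shows. In the scrapy shell you would pass response.text instead.

```python
import re


def has_id(html: str, element_id: str) -> bool:
    """Return True if the HTML text contains an id="..." attribute with this value.

    Hypothetical helper for illustration; a plain regex check, not a full parser.
    """
    pattern = r'(?<![\w-])id\s*=\s*["\']' + re.escape(element_id) + r'["\']'
    return re.search(pattern, html) is not None


# Stand-in for the HTML scrapy downloads: the table has classes but no id.
raw_source = '<table class="nowraplinks collapsible autocollapse navbox-inner"></table>'

# Stand-in for the DOM after JavaScript has run (assumed shape, for illustration).
rendered_dom = '<table id="collapsibleTable1" class="navbox-inner"></table>'

print(has_id(raw_source, "collapsibleTable1"))
print(has_id(rendered_dom, "collapsibleTable1"))
```

If the check fails on response.text, the selector has to be based on something that does exist in the raw source, such as a class name.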
Looking at the source, the data you want is shown as:
<table class="nowraplinks collapsible autocollapse navbox-inner" style="border-spacing:0;background:transparent;color:inherit">
<tr>
<th scope="col" class="navbox-title" colspan="2">
<div class="plainlinks hlist navbar mini">
<ul>
<li class="nv-view"><a href="/wiki/Template:Commonwealth_Games_Medal_Counts" title="Template:Commonwealth Games Medal Counts"><abbr title="View this template" style=";;background:none transparent;border:none;-moz-box-shadow:none;-webkit-box-shadow:none;box-shadow:none;">v</abbr></a></li>
<li class="nv-talk"><a href="/wiki/Template_talk:Commonwealth_Games_Medal_Counts" title="Template talk:Commonwealth Games Medal Counts"><abbr title="Discuss this template" style=";;background:none transparent;border:none;-moz-box-shadow:none;-webkit-box-shadow:none;box-shadow:none;">t</abbr></a></li>
<li class="nv-edit"><a class="external text" href="//en.wikipedia.org/w/index.php?title=Template:Commonwealth_Games_Medal_Counts&action=edit"><abbr title="Edit this template" style=";;background:none transparent;border:none;-moz-box-shadow:none;-webkit-box-shadow:none;box-shadow:none;">e</abbr></a></li>
</ul>
</div>
<div id="Commonwealth_Games_medal_tables" style="font-size:114%;margin:0 4em"><a href="/wiki/All-time_Commonwealth_Games_medal_table" title="All-time Commonwealth Games medal table">Commonwealth Games medal tables</a></div>
</th>
</tr>
<tr>
<td colspan="2" class="navbox-list navbox-odd hlist" style="width:100%;padding:0px">
<div style="padding:0em 0.25em">
<ul>
<li><a href="/wiki/1930_British_Empire_Games#Medal_table" title="1930 British Empire Games">1930</a></li>
<li><a href="/wiki/1934_British_Empire_Games#Medals_by_country" title="1934 British Empire Games">1934</a></li>
<li><a href="/wiki/1938_British_Empire_Games#Medals_by_country" title="1938 British Empire Games">1938</a></li>
<li><a href="/wiki/1950_British_Empire_Games#Medals_by_country" title="1950 British Empire Games">1950</a></li>
<li><a href="/wiki/1954_British_Empire_and_Commonwealth_Games#Medal_table" title="1954 British Empire and Commonwealth Games">1954</a></li>
<li><a href="/wiki/1958_British_Empire_and_Commonwealth_Games#Medals_by_country" title="1958 British Empire and Commonwealth Games">1958</a></li>
<li><a href="/wiki/1962_British_Empire_and_Commonwealth_Games#Medals_by_country" title="1962 British Empire and Commonwealth Games">1962</a></li>
<li><a href="/wiki/1966_British_Empire_and_Commonwealth_Games#Medals_by_country" title="1966 British Empire and Commonwealth Games">1966</a></li>
<li><a href="/wiki/1970_British_Commonwealth_Games#Medals_by_country" title="1970 British Commonwealth Games">1970</a></li>
<li><a href="/wiki/1974_British_Commonwealth_Games#Medals_by_country" title="1974 British Commonwealth Games">1974</a></li>
<li><a href="/wiki/1978_Commonwealth_Games#Medals_by_country" title="1978 Commonwealth Games">1978</a></li>
<li><a href="/wiki/1982_Commonwealth_Games#Medals_by_country" title="1982 Commonwealth Games">1982</a></li>
<li><a href="/wiki/1986_Commonwealth_Games#Medals_by_country" title="1986 Commonwealth Games">1986</a></li>
<li><a href="/wiki/1990_Commonwealth_Games#Medals_by_country" title="1990 Commonwealth Games">1990</a></li>
<li><a href="/wiki/1994_Commonwealth_Games#Medal_table" title="1994 Commonwealth Games">1994</a></li>
<li><a href="/wiki/1998_Commonwealth_Games#Medal_table" title="1998 Commonwealth Games">1998</a></li>
<li><a href="/wiki/2002_Commonwealth_Games#Final_medal_table" title="2002 Commonwealth Games">2002</a></li>
<li><a href="/wiki/2006_Commonwealth_Games_medal_table" title="2006 Commonwealth Games medal table">2006</a></li>
<li><a href="/wiki/2010_Commonwealth_Games_medal_table" title="2010 Commonwealth Games medal table">2010</a></li>
<li><a href="/wiki/2014_Commonwealth_Games_medal_table" title="2014 Commonwealth Games medal table">2014</a></li>
<li><a href="/wiki/2018_Commonwealth_Games_medal_table" title="2018 Commonwealth Games medal table">2018</a></li>
</ul>
</div>
</td>
</tr>
</table>
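Since the table in the raw source carries the navbox-inner class rather than the id, you can select on that instead. In scrapy, something like response.css('table.navbox-inner td.navbox-list a::attr(href)').getall() together with response.urljoin should give you the per-year links, assuming this is the only such table on the page. To illustrate the same idea without scrapy, here is a rough standard-library sketch; NavboxLinkParser and the shortened sample markup are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NavboxLinkParser(HTMLParser):
    """Collect links inside td.navbox-list cells of a table.navbox-inner (illustrative only)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.table_depth = 0       # > 0 while inside the navbox table
        self.in_list_cell = False  # True while inside a td.navbox-list cell
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "table":
            # Start counting at the navbox table; keep counting nested tables.
            if self.table_depth or "navbox-inner" in classes:
                self.table_depth += 1
        elif self.table_depth and tag == "td" and "navbox-list" in classes:
            self.in_list_cell = True
        elif self.in_list_cell and tag == "a" and attrs.get("href"):
            # Resolve relative hrefs such as /wiki/... against the page URL.
            self.links.append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth:
            self.table_depth -= 1
        elif tag == "td":
            self.in_list_cell = False


# Shortened version of the markup above, keeping only two of the year links.
sample = """
<table class="nowraplinks collapsible autocollapse navbox-inner">
<tr><th class="navbox-title"><a href="/wiki/Template:Commonwealth_Games_Medal_Counts">v</a></th></tr>
<tr><td class="navbox-list navbox-odd hlist"><ul>
<li><a href="/wiki/1930_British_Empire_Games#Medal_table">1930</a></li>
<li><a href="/wiki/1934_British_Empire_Games#Medals_by_country">1934</a></li>
</ul></td></tr>
</table>
"""

parser = NavboxLinkParser("https://en.wikipedia.org/wiki/1930_British_Empire_Games")
parser.feed(sample)
for link in parser.links:
    print(link)
```

The template link in the header row is skipped because only anchors inside the navbox-list cell are collected, which leaves just the links to the other games' pages.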
Upvotes: 1