Reputation: 447
I'm parsing http://www.treccani.it/lingua_italiana/sinonimi_regionali/ using Python 3 and BeautifulSoup. I've parsed the first page and now need to go on to the second page, the third, and so on. Moving to another page is done with a button (an image):
<div class="next">
<a href="#" onClick="doSearch(1, 4, 37); return false;" title="Pagina successiva">
<img src="/export/system/modules/it.banzai.treccani.portale3/resources/images/arrow-right.png" />
</a>
</div>
How can I get the link to the next page? Or how else can I move between pages using Python?
Upvotes: 0
Views: 329
Reputation: 1172
The problem with BeautifulSoup is that it only sees a static page. If the link is not in the HTML, you cannot get it with BeautifulSoup, because it is simply a parser and does not run the page's JavaScript.
As mentioned in the other answers, a good approach here is Selenium. You could also find the doSearch JavaScript function, work out what it is doing, and replicate it on your Python end, though this does seem a little messy. After looking at the doSearch function, Selenium seems like your best shot.
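If you do want to try replicating doSearch, a first step is pulling its arguments out of the onClick attribute. Below is a minimal sketch using only the standard library; the helper name parse_do_search is made up for illustration. What the three numbers mean, and what request doSearch actually sends, is not shown in this thread, so the sketch stops at extracting them. Note that BeautifulSoup's usual parsers lowercase attribute names, so you would likely read the attribute as a["onclick"] before passing it in.

```python
import re

# Matches a call such as: doSearch(1, 4, 37)
DO_SEARCH_RE = re.compile(r"doSearch\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)")


def parse_do_search(text):
    """Return the three integer arguments of the first doSearch(...) call
    found in `text`, or None if there is no such call.

    Hypothetical helper: the meaning of the arguments is NOT known here.
    """
    match = DO_SEARCH_RE.search(text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


# The anchor tag from the question, used as sample input.
snippet = (
    '<div class="next">'
    '<a href="#" onClick="doSearch(1, 4, 37); return false;" '
    'title="Pagina successiva"></a></div>'
)
print(parse_do_search(snippet))  # (1, 4, 37)
```

You would still have to read the site's JavaScript to see which URL and parameters doSearch uses before you could request the next page directly.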
Upvotes: 1
Reputation: 51877
I think you're going to need a JavaScript engine rather than Beautiful Soup.
One good approach is browser automation via Selenium. The alternative is guessing: you would have to know what the doSearch function is actually doing, and if they change the JavaScript, your code will no longer do what you expect.
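A rough sketch of the Selenium route, under some assumptions: the selector "div.next a" is taken from the markup in the question, the function name scrape_pages is made up, the fixed pause is a crude stand-in for a proper wait, and it is only a guess that the last page has no such link. Selenium is a third-party package (pip install selenium) and needs a browser it can drive.

```python
import time


def scrape_pages(driver, url, max_pages=3, wait_seconds=2.0):
    """Open `url`, then click the "next" arrow repeatedly, returning the
    HTML of each page visited (at most `max_pages`, which should be >= 1).

    `driver` is a Selenium WebDriver. The selector below is an assumption
    based on the markup shown in the question.
    """
    pages = []
    driver.get(url)
    while True:
        pages.append(driver.page_source)
        if len(pages) >= max_pages:
            break
        # "css selector" is the value of Selenium's By.CSS_SELECTOR constant.
        links = driver.find_elements("css selector", "div.next a")
        if not links:
            break  # no "next" link found, assume this was the last page
        links[0].click()  # lets the browser run doSearch(...) for us
        time.sleep(wait_seconds)  # crude wait for the new results to load
    return pages


def main():
    # Imported here so the function above can be exercised without Selenium.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        url = "http://www.treccani.it/lingua_italiana/sinonimi_regionali/"
        for html in scrape_pages(driver, url):
            print(len(html))  # hand each page to BeautifulSoup here instead
    finally:
        driver.quit()
```

Calling main() opens a browser window; each returned string can be passed to BeautifulSoup exactly as you already do for the first page.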
Upvotes: 1