user1944267

Reputation: 1637

How to crawl a web page with AJAX elements

I want to crawl some web pages, such as the following:

http://www.youtube.com/user/koglin66/feed?filter=2

The page has a 'load more' button, which triggers an AJAX request:

http://www.youtube.com/channel_ajax?action_load_more_feed_items=1&activity_view=1&paging=1352148528&channel_id=UCCw8aVnsIeu9S6OPQyaQ14g

I want to crawl the whole page. Manually, I have to click the button repeatedly until there is nothing more to load. How can I automate this and crawl the whole page? Thanks!

Upvotes: 0

Views: 983

Answers (2)

user2647646

Reputation: 101

Yes, you can use Selenium IDE, or another program or library with a browser core (such as WebKit or the IE ActiveX control), to perform the click action.

You can also try FMiner (http://www.fminer.com/), which can record and replay human actions in a browser to scrape data, but it's not free.
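
As a rough illustration of the "click until nothing is left" idea, here is a minimal sketch using Selenium WebDriver's Python bindings. It rests on several assumptions: the button can be located with a CSS selector (you would have to find the real one by inspecting the page), a fixed pause is enough for each AJAX response to render, and Firefox with its driver is installed. The names `click_until_gone`, `make_selenium_finder` and `crawl` are made up for this example.

```python
import time


def click_until_gone(find_button, max_clicks=200, pause=1.0):
    """Click the button returned by find_button() until it returns None.

    find_button is any callable that returns an object with a .click()
    method, or None when there is nothing left to load.
    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:  # safety cap so a stuck button cannot loop forever
        button = find_button()
        if button is None:
            break
        button.click()
        clicks += 1
        time.sleep(pause)  # crude wait for the AJAX content to arrive
    return clicks


def make_selenium_finder(driver, css_selector):
    """Build a find_button callable for a Selenium WebDriver instance."""
    from selenium.webdriver.common.by import By  # imported lazily

    def find_button():
        # find_elements returns an empty list when nothing matches
        for element in driver.find_elements(By.CSS_SELECTOR, css_selector):
            if element.is_displayed():
                return element
        return None

    return find_button


def crawl(url, css_selector):
    """Open url, expand it fully, and return the resulting HTML."""
    from selenium import webdriver  # requires the selenium package

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        click_until_gone(make_selenium_finder(driver, css_selector))
        return driver.page_source
    finally:
        driver.quit()
```

You would call something like `crawl(page_url, "button.load-more")`, where the selector is only a placeholder and not the page's actual markup.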

Upvotes: 1

Pratik

Reputation: 1

I recently faced the same problem with another website I wanted to scrape. I use Java, and after some research on the web I used Selenium IDE for Firefox, in which you can write Java JUnit test cases that automatically open the web page, click buttons, fill in forms, etc. It also supports C#, Python, Ruby, etc.

I used it to click the Load More button, and once the page was fully loaded after all the clicks, I saved it manually.
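
The manual save step could probably be scripted as well. Here is a minimal sketch, in Python rather than the Java I used, assuming the fully expanded HTML is already available as a string (for example from Selenium WebDriver's `driver.page_source`). The function name `save_page` and the output path are made up for illustration.

```python
from pathlib import Path


def save_page(html, path):
    """Write the rendered HTML to disk as UTF-8 and return the byte count."""
    target = Path(path)
    # Create the output directory if it does not exist yet
    target.parent.mkdir(parents=True, exist_ok=True)
    data = html.encode("utf-8")
    target.write_bytes(data)
    return len(data)
```

For example, `save_page(driver.page_source, "output/feed.html")` would store the expanded page for later parsing.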

You can download Selenium from their website. I also found this YouTube video useful: http://www.youtube.com/watch?v=twdDfDOrHC4

Upvotes: 0
