Malvi Patel

Reputation: 106

Scraping movie reviews from the IMDb website using Python's BeautifulSoup library


I want to scrape all the reviews for a particular movie from the IMDb website. I am using BeautifulSoup with the 'html.parser' parser for this.

Link

Consider the link above. I want to scrape all the reviews for this movie (69 in total), but since only 25 reviews are visible on the page, BeautifulSoup extracts only those 25 instead of all of them.

My Code:

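import requests
from bs4 import BeautifulSoup

url = "https://www.imdb.com/title/tt6654210/reviews?ref_=tt_ov_rt"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
# getReviewsList is a helper defined elsewhere (not shown) that returns the reviews found in the soup
review_list = getReviewsList(soup)
print(len(review_list))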

Output:

25

I am quite new to web scraping and would be grateful if anyone could help me with this.

Upvotes: 0

Views: 1289

Answers (1)

Adán Escobar

Reputation: 4723

To scrape a page, you first have to understand how it works: inspect it with your browser's dev tools, analyze the network calls, and then emulate the call you need.

In this case, the page uses AJAX to load the reviews one page at a time.

You have to call:

https://www.imdb.com/title/tt6654210/reviews/_ajax?ref_=undefined&paginationKey=g4wp7dreqyzd4zql7kvh3obyrtum6az4y4hhzo5ziwr26fbyhvrl4ty4o4yvzmjkcrxndtvd7hmf6y6yefcmwoi6hkwovare

The pagination key is provided in the page by the following tag:

<div class="load-more-data" data-key="g4wp7dreqyzd4zql7kvh3obyrtum6az4y4hhzo5ziwr26fbyhvrl4ty4o4yvzmjkcrxndtvd7hmf6y6yefcmwoi6hkwovare" data-ajaxurl="/title/tt6654210/reviews/_ajax">
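
To make the idea concrete, here is a minimal sketch of the loop, not a definitive implementation. It reads data-key and data-ajaxurl from the load-more-data tag, builds the _ajax URL shown above, and repeats until no key is left. Several things here are assumptions on my part: that each AJAX response carries its own load-more-data tag with the next key, that each review body sits in a div with class "text", and the names fetch_html, scrape_all_reviews and max_pages, which are only illustrative.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

BASE_URL = "https://www.imdb.com"


def fetch_html(url):
    """Download a page and return its HTML text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def scrape_all_reviews(first_url, fetch=fetch_html, max_pages=100):
    """Collect review texts from the first page and each AJAX page after it."""
    reviews = []
    url = first_url
    for _ in range(max_pages):  # safety cap so a repeated key cannot loop forever
        soup = BeautifulSoup(fetch(url), "html.parser")

        # ASSUMPTION: each review body is a <div> whose classes include "text".
        reviews.extend(div.get_text(strip=True) for div in soup.select("div.text"))

        # The tag quoted above carries the key and the AJAX path for the next page.
        more = soup.find("div", class_="load-more-data")
        if more is None or not more.get("data-key") or not more.get("data-ajaxurl"):
            break  # no key or path: treat this as the last page
        ajax_url = urljoin(BASE_URL, more["data-ajaxurl"])
        url = f"{ajax_url}?ref_=undefined&paginationKey={more['data-key']}"
    return reviews
```

Calling scrape_all_reviews("https://www.imdb.com/title/tt6654210/reviews?ref_=tt_ov_rt") should then return more than the first 25 reviews, provided the markup still matches these assumptions. The fetch parameter is there so you can swap in your own downloader, or a fake one for testing without hitting the site.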

I hope this helps.

Upvotes: 1
