Reputation: 1791
I want to get the image src of every coming-soon movie listed on
Fandango.com.
This is my code:
import requests
from bs4 import BeautifulSoup

def poster(genre):
    poster_link = []
    request = requests.get('http://www.fandango.com/moviescomingsoon?GenreFilter=' + genre)
    content = request.content
    soup = BeautifulSoup(content, "html.parser")
    soup2 = soup.find('div', {'class':'movie-ls-group'})
    elements = soup2.find_all('img')
    for element in elements:
        poster_link.append(element.get('src'))
    return poster_link
When I print the poster_link list, it contains None
instead of the image sources.
Upvotes: 1
Views: 2936
Reputation: 547
James's answer is great, but I noticed it grabs more than the images for that particular section: it also picks up the 'New + Coming Soon' section at the bottom of the page, which seems to be outside the scope of the genre and appears on other pages too. This code restricts the image grab to just the genre-specific coming-soon section.
import requests
from bs4 import BeautifulSoup

def poster(genre):
    poster_link = []
    request = requests.get('http://www.fandango.com/moviescomingsoon?GenreFilter=' + genre)
    content = request.content
    soup = BeautifulSoup(content, "html.parser")
    # The first 'movie-ls-group' div is the genre-specific coming-soon section.
    comingsoon = soup.find_all('div', {'class':'movie-ls-group'})
    movies = comingsoon[0].find_all('img', {'class':'visual-thumb'})
    for movie in movies:
        poster_link.append(movie.get('data-src'))
    return poster_link

print(poster('Horror'))
You might also want to filter the 'emptysource.jpg' images out of your poster_link
list before returning it, as they look like empty placeholders for movies without poster images.
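A minimal sketch of that filtering step, assuming the placeholders can be recognised by the 'emptysource.jpg' file name mentioned above. The helper name drop_placeholders and the sample URLs are made up for illustration. It also drops None entries, which show up when an img tag lacks the requested attribute.

```python
def drop_placeholders(links, placeholder='emptysource.jpg'):
    """Return only the links that look like real poster images."""
    # Skip None/empty entries and anything containing the placeholder name.
    return [link for link in links if link and placeholder not in link]


# Hypothetical sample data, not real Fandango URLs.
sample = [
    'http://example.com/posters/movie1.jpg',
    'http://example.com/images/emptysource.jpg',
    None,
]
print(drop_placeholders(sample))
```

Inside poster() this would amount to returning drop_placeholders(poster_link) instead of poster_link.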
Upvotes: 1
Reputation: 36608
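Try this. It skips the subsetting step and grabs every image that has the visual-thumb class. It also reads data-src instead of src, since that appears to be where these img tags keep the poster URL, which would explain why element.get('src') gave you None.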
import requests
from bs4 import BeautifulSoup

def poster(genre):
    poster_link = []
    request = requests.get('http://www.fandango.com/moviescomingsoon?GenreFilter=%s' % genre)
    content = request.content
    soup = BeautifulSoup(content, "html.parser")
    # Select the poster thumbnails directly by class.
    imgs = soup.find_all('img', {'class': 'visual-thumb'})
    for img in imgs:
        poster_link.append(img.get('data-src'))
    return poster_link
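If it turns out that some images carry a plain src while others only have data-src, a small fallback may be safer than hard-coding one attribute. This is only a sketch under that assumption about the markup, which I have not confirmed for Fandango. poster_url is a hypothetical helper. It relies only on .get, which a BeautifulSoup Tag and a plain dict both provide, so the dicts below stand in for img tags.

```python
def poster_url(img):
    """Prefer data-src, then fall back to src; None if neither is set."""
    return img.get('data-src') or img.get('src')


# Plain dicts stand in for <img> tags; the URLs are made up.
lazy_img = {'class': ['visual-thumb'], 'data-src': 'http://example.com/a.jpg'}
plain_img = {'src': 'http://example.com/b.jpg'}
no_image = {'alt': 'no poster'}

print(poster_url(lazy_img))
print(poster_url(plain_img))
print(poster_url(no_image))
```

In the loop above you would then append poster_url(img) instead of img.get('data-src').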
Upvotes: 1