Reputation: 21
I'm having a problem with the code below: whenever I run it, it returns only the very first value for the movie ID and movie name. I was hoping to create something that would record every title and ID on Netflix.
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url ='https://www.netflix.com/browse/genre/1365'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html,"html.parser")
containers = page_soup.findAll("li",{"class":"nm-content-horizontal-row-item"})
for container in containers:
    title_container = container.findAll("a",{"class":"nm-collections-title nm-collections-link"})
    title_container = title_container[0].text
    movieid = container.findAll("a",{"class":"nm-collections-title nm-collections-link"})
    movieid = movieid[0].attrs['href']
print("Movie Name: " + title_container, "\n")
print("Movie ID: " , movieid, "\n")
Upvotes: 0
Views: 54
Reputation: 2977
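You have to move your print statements into your loop so that they run on every iteration. While they sit outside the loop, they execute only once, after the loop has finished, so only a single movie is printed.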
for container in containers:
    title_container = container.findAll("a",{"class":"nm-collections-title nm-collections-link"})
    title_container = title_container[0].text
    movieid = container.findAll("a",{"class":"nm-collections-title nm-collections-link"})
    movieid = movieid[0].attrs['href']
    print("Movie Name: " + title_container, "\n") # move them in
    print("Movie ID: " , movieid, "\n")
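If you want to keep the results rather than only print them, one option is to collect each title and link inside the loop and print afterwards. The following is a minimal sketch, not a definitive implementation: `SAMPLE_HTML` and `extract_movies` are hypothetical names introduced here, and the sample markup only assumes the class names from the question, so the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
# It reuses the class names from the question; the real page may differ.
SAMPLE_HTML = """
<ul>
  <li class="nm-content-horizontal-row-item">
    <a class="nm-collections-title nm-collections-link" href="/title/1">First Movie</a>
  </li>
  <li class="nm-content-horizontal-row-item">
    <a class="nm-collections-title nm-collections-link" href="/title/2">Second Movie</a>
  </li>
</ul>
"""


def extract_movies(html):
    """Return a list of (title, href) pairs, one per row item."""
    page_soup = BeautifulSoup(html, "html.parser")
    movies = []
    for container in page_soup.find_all("li", class_="nm-content-horizontal-row-item"):
        # One lookup gives both the title text and the href.
        link = container.find("a", class_="nm-collections-link")
        if link is None:
            continue  # skip rows that have no title link
        # The append is inside the loop, so every row is recorded.
        movies.append((link.get_text(strip=True), link.get("href")))
    return movies


movies = extract_movies(SAMPLE_HTML)
for title, href in movies:
    print("Movie Name:", title)
    print("Movie ID:", href)
```

Looking up the link once per container also avoids repeating the same `findAll` call for the title and the ID.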
Upvotes: 3