Reputation: 409
def get_html(url):
    response = urllib.request.urlopen(url)
    return response.read()

def parse_main(html):
    webpage = BeautifulSoup(html, features="html.parser")
    table = webpage.find('table', id='itemList')
    for a_tag in table.find_all('a', class_='all'):
        parse_movie(get_html('https://www.somerandommovieswebsite.com' + a_tag['href']))

def parse_movie(html):
    web_page = BeautifulSoup(html, features="html.parser")
    info = web_page.find('h1', class_="moviename")
    movies.append(info.text)

def main():
    movies = []
    parse_main(get_html('https://www.somerandommovieswebsite.com'))
    print(movies)

if __name__ == '__main__':
    main()
How do I access the movies list (defined in the main() function) from parse_movie, which is called from parse_main? I can't append anything to the list because of an "unresolved reference 'movies'" error. Using nonlocal didn't help.
Upvotes: 1
Views: 81
Reputation: 1362
There are several ways to do it.
First, you can define movies globally.
Second, you can pass the list as a parameter, like this.
A function receives a reference to the same list object, so appending inside parse_movie changes the list defined in main, and nothing needs to be returned to main.
def parse_main(html, movies):
    webpage = BeautifulSoup(html, features="html.parser")
    table = webpage.find('table', id='itemList')
    for a_tag in table.find_all('a', class_='all'):
        parse_movie(get_html('https://www.somerandommovieswebsite.com' + a_tag['href']), movies)

def parse_movie(html, movies):
    web_page = BeautifulSoup(html, features="html.parser")
    info = web_page.find('h1', class_="moviename")
    movies.append(info.text)

def main():
    movies = []
    parse_main(get_html('https://www.somerandommovieswebsite.com'), movies)
    print(movies)
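To see why the version above needs no return value, here is a minimal sketch without any scraping. The names add_title, rebind and collect are hypothetical and only illustrate the behavior: appending changes the caller's list, while assigning to the parameter does not.

```python
def add_title(title, titles):
    # Appending mutates the list object the caller passed in;
    # no new list is created, so nothing has to be returned.
    titles.append(title)


def rebind(titles):
    # Assigning to the parameter only rebinds the local name;
    # the caller's list is left untouched.
    titles = ["replaced"]


def collect():
    titles = []
    add_title("Alien", titles)
    add_title("Brazil", titles)
    rebind(titles)
    return titles


if __name__ == "__main__":
    print(collect())  # prints ['Alien', 'Brazil']
```

So passing the list works as long as the called function only mutates it (append, extend, and so on) and never assigns a new list to the parameter.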
The third approach is to build the list inside a function and return it.
def parse_main(html):
    webpage = BeautifulSoup(html, features="html.parser")
    table = webpage.find('table', id='itemList')
    movies = []
    for a_tag in table.find_all('a', class_='all'):
        movies.append(parse_movie(get_html('https://www.somerandommovieswebsite.com' + a_tag['href'])))
    return movies

def parse_movie(html):
    web_page = BeautifulSoup(html, features="html.parser")
    info = web_page.find('h1', class_="moviename")
    return info.text

def main():
    movies = parse_main(get_html('https://www.somerandommovieswebsite.com'))
    print(movies)
Upvotes: 4
Reputation: 977
Pass the movies list as an argument and avoid global variables; in most cases that is the better choice.
The issue was that movies is a local variable of main, so the name movies inside parse_movie does not refer to it.
In the code below, parse_movie returns the title, parse_main collects the titles into a list and returns it, and main receives that list through the return statements.
import urllib.request

from bs4 import BeautifulSoup

def get_html(url):
    response = urllib.request.urlopen(url)
    return response.read()

def parse_main(html):
    movies = []  # collected here and returned to the caller
    webpage = BeautifulSoup(html, features="html.parser")
    table = webpage.find('table', id='itemList')
    for a_tag in table.find_all('a', class_='all'):
        movies.append(parse_movie(get_html('https://www.somerandommovieswebsite.com' + a_tag['href'])))
    return movies

def parse_movie(html):
    web_page = BeautifulSoup(html, features="html.parser")
    info = web_page.find('h1', class_="moviename")
    return info.text  # return the title instead of appending to a shared list

def main():
    movies = parse_main(get_html('https://www.somerandommovieswebsite.com'))
    print(movies)

if __name__ == '__main__':
    main()
Upvotes: 1
Reputation: 4330
The easiest approach would be a global variable, but you should avoid global variables whenever possible. You can change your code along these lines, which avoids both the global variable and passing the list as a parameter:
import urllib.request

from bs4 import BeautifulSoup

def get_html(url):
    response = urllib.request.urlopen(url)
    return response.read()

def parse_main(html):
    movies = []
    webpage = BeautifulSoup(html, features="html.parser")
    table = webpage.find('table', id='itemList')
    for a_tag in table.find_all('a', class_='all'):
        movies.append(parse_movie(get_html('https://www.somerandommovieswebsite.com' + a_tag['href'])))
    # Return the same list that was filled above.
    return movies

def parse_movie(html):
    web_page = BeautifulSoup(html, features="html.parser")
    info = web_page.find('h1', class_="moviename")
    return info.text

def main():
    movies = parse_main(get_html('https://www.somerandommovieswebsite.com'))
    print(movies)

if __name__ == '__main__':
    main()
Upvotes: 1
Reputation: 27283
I think you should neither use a global variable here nor pass it as an argument:
import urllib.request

from bs4 import BeautifulSoup

def get_html(url):
    response = urllib.request.urlopen(url)
    return response.read()

def parse_main(html):
    movies = []
    webpage = BeautifulSoup(html, features="html.parser")
    table = webpage.find('table', id='itemList')
    for a_tag in table.find_all('a', class_='all'):
        movies.append(
            parse_movie(get_html('https://www.somerandommovieswebsite.com' + a_tag['href']))
        )
    return movies

def parse_movie(html):
    web_page = BeautifulSoup(html, features="html.parser")
    info = web_page.find('h1', class_="moviename")
    return info.text

def main():
    movies = parse_main(get_html('https://www.somerandommovieswebsite.com'))
    print(movies)

if __name__ == '__main__':
    main()
Upvotes: 4
Reputation: 1672
movies is a local variable inside your main function, so it's expected that parse_movie can't find it. Either make it global (not always a good idea) or pass it as an argument.
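The scoping rule behind this can be shown in a small sketch. The function names below are hypothetical. The sketch also hints at why nonlocal didn't help in the question: nonlocal only reaches names in an enclosing function's scope, and parse_movie is defined at module level, not inside main.

```python
def append_title(title):
    # `movies` is neither local nor global here, so the lookup fails
    # with NameError when the function is called.
    movies.append(title)


def main_broken():
    movies = []  # local to main_broken, invisible to append_title
    try:
        append_title("Alien")
    except NameError as exc:
        return f"NameError: {exc}"
    return movies


def main_nested():
    movies = []

    def append_title_nested(title):
        # Defined inside main_nested, so it closes over `movies`.
        # `nonlocal` would only be needed to rebind the name, not to append.
        movies.append(title)

    append_title_nested("Alien")
    return movies


if __name__ == "__main__":
    print(main_broken())
    print(main_nested())
```

Calling a function from main does not give it access to main's locals; only a function defined inside main can see them.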
Upvotes: 0