Reputation: 1
I have a .csv file with 1000 movies (https://github.com/LearnDataSci/articles/blob/master/Python%20Pandas%20Tutorial%20A%20Complete%20Introduction%20for%20Beginners/IMDB-Movie-Data.csv) and have been set the task of getting IMDb URLs for all 1000 movies.
I've tried using Cinemagoer, but I don't particularly want to plug in all 1000 titles by hand. Is there a more efficient way?
Upvotes: -2
Views: 326
Reputation: 1
You can parse the table and use its title column (index 1) as the Cinemagoer search query. Each search result carries an IMDb ID, and you can form the URL from that.
import csv
from io import StringIO

import requests
from imdb import Cinemagoer

TABLE_URL = 'https://github.com/LearnDataSci/articles/raw/master/Python%20Pandas%20Tutorial%20A%20Complete%20Introduction%20for%20Beginners/IMDB-Movie-Data.csv'


def get_table(url):
    # Download the CSV and wrap it in a file-like object for csv.reader
    r = requests.get(url)
    r.raise_for_status()
    return StringIO(r.content.decode())


def search_movie(interface, name):
    # Take the first search hit, or None if nothing matched
    movies = interface.search_movie(name)
    return movies[0] if movies else None


def form_url(movie):
    # movieID is the numeric IMDb ID without the 'tt' prefix
    return f'http://imdb.com/title/tt{movie.movieID}'


def main():
    ia = Cinemagoer()
    reader_obj = csv.reader(get_table(TABLE_URL))
    # Skip column names
    next(reader_obj)
    for row in reader_obj:
        # Column 1 holds the movie title
        movie = search_movie(ia, row[1])
        # Guard against titles that return no search results
        print(form_url(movie) if movie else f'No match for {row[1]}')


main()
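Note that the script above takes the first search hit, which can be the wrong film when several share a title. A rough sketch of one way to narrow this down follows. It assumes the CSV has a year column you can pass in, and that each search result supports `.get('year')` returning an integer. `pick_match` is a hypothetical helper, not part of Cinemagoer.

```python
def pick_match(results, year=None):
    """Pick a search result, preferring one released in `year`.

    `results` is assumed to be a list of dict-like objects that have a
    'year' key (an assumption about the search results).
    Falls back to the first result when no year matches,
    and returns None when the list is empty.
    """
    if not results:
        return None
    if year is not None:
        for movie in results:
            if movie.get('year') == year:
                return movie
    return results[0]


def form_url(movie):
    # Same URL shape as in the answer: 'tt' plus the numeric movieID
    return f'http://imdb.com/title/tt{movie.movieID}'
```

Inside the loop you would then call something like `pick_match(ia.search_movie(row[1]), int(row[6]))`, assuming index 6 is the year column in this file. It is worth spot-checking a few results by hand, since a year match is only a heuristic.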
Upvotes: 0