Reputation: 47
I'm working on a program that takes in an IMDB text file and outputs the top N actors (by number of movie appearances), where N is given by the user.
However, actors who appear in the same number of movies each take up a separate slot in the output, which I need to avoid. Instead, if two actors are each in 5 movies, for example, the number 5 should appear once, with the actors' names combined and separated by a semicolon.
I've tried multiple workarounds and nothing has worked so far. Any suggestions?
if __name__ == "__main__":
    imdb_file = raw_input("Enter the name of the IMDB file ==> ").strip()
    print imdb_file
    N = input('Enter the number of top individuals ==> ')
    print N

    actors_to_movies = {}

    for line in open(imdb_file):
        words = line.strip().split('|')
        actor = words[0].strip()
        movie = words[1].strip()
        if not actor in actors_to_movies:
            actors_to_movies[actor] = set()
        actors_to_movies[actor].add(movie)
        movie_list = sorted(list(actors_to_movies[actor]))

    # Arranges dictionary into a list of tuples
    D = [(x, actors_to_movies[x]) for x in actors_to_movies]
    descending = sorted(D, key=lambda x: len(x[1]), reverse=True)

    # Prints tuples in descending order, N times (user input)
    for i in range(N):
        print str(len(descending[i][1])) + ':', descending[i][0]
Upvotes: 0
Views: 237
Reputation: 2237
There is a useful function for this: itertools.groupby.
It splits an iterable into groups of consecutive items that share a key, so the list has to be sorted by that key first. Using it, you can easily write a function that prints the top actors:
import itertools

def print_top_actors(actor_info_list, top=5):
    """
    :param actor_info_list: list of (actor_name, movie_count) tuples
    """
    # groupby only merges adjacent items, so sort by movie count first
    actor_info_list.sort(key=lambda x: x[1], reverse=True)
    # Group on the movie count; without a key, groupby compares whole tuples
    groups = itertools.groupby(actor_info_list, key=lambda x: x[1])
    for i, (movie_count, actor_iter) in enumerate(groups):
        if i >= top:
            break
        print movie_count, ';'.join(actor for actor, _ in actor_iter)
An example of usage:
>>> print_top_actors(
... [
... ("DiCaprio", 100500),
... ("Pitt", 100500),
... ("foo", 10),
... ("bar", 10),
... ("baz", 10),
... ("qux", 3),
... ("lol", 1)
... ], top = 3)
100500 DiCaprio;Pitt
10 foo;bar;baz
3 qux
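To connect this to the dictionary built in the question, here is a minimal sketch that goes from an actor-to-movies mapping straight to the grouped lines. The function name top_actor_groups and the sample data are made up for illustration, and breaking ties by actor name is an assumption rather than something the question asks for. It returns the lines instead of printing them, so it runs unchanged on Python 2 and Python 3.

```python
import itertools

def top_actor_groups(actors_to_movies, top):
    """Return up to `top` lines like '5: A;B', one per distinct movie count."""
    # (actor, count) pairs sorted by count descending, then by name (assumed tie-break)
    counts = sorted(
        ((actor, len(movies)) for actor, movies in actors_to_movies.items()),
        key=lambda pair: (-pair[1], pair[0])
    )
    lines = []
    # groupby only merges adjacent items, so the input must already be sorted by count
    for movie_count, group in itertools.groupby(counts, key=lambda pair: pair[1]):
        if len(lines) >= top:
            break
        names = ";".join(actor for actor, _ in group)
        lines.append("%d: %s" % (movie_count, names))
    return lines

# Made-up sample data standing in for the parsed IMDB file
sample = {
    "A": {"m1", "m2"},
    "B": {"m1", "m3"},
    "C": {"m1"},
}
for line in top_actor_groups(sample, 2):
    print(line)
```

Because the loop stops after `top` groups rather than `top` actors, tied actors share one slot, which is the behaviour the question asks for.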
Upvotes: 3