Pedro Alves

Reputation: 1054

Aggregate multiple rows from a List - Python

I'm trying to aggregate multiple values from a list so that my final result has no duplicate rows.

I have this code:

import imdb
ia = imdb.IMDb()
top250 = ia.get_top250_movies()
i = 0;
for topmovie in top250:
    # First, retrieve the movie object using its ID
    movie = ia.get_movie(topmovie.movieID)
    # Print the movie's genres
    for genre in movie['genres']:
        cast = movie.get('cast')
        topActors = 2
        i = i+1;
        for actor in cast[:topActors]:
            if i <= 10:
                print('Movie: ', movie, ' Genre: ', genre, 'Actors: ', actor['name']);
            else:
                break;

And I'm getting this:

Movie:  The Shawshank Redemption  Genre:  Drama Actors:  Tim Robbins
Movie:  The Shawshank Redemption  Genre:  Drama Actors:  Morgan Freeman
Movie:  The Godfather  Genre:  Crime Actors:  Marlon Brando
Movie:  The Godfather  Genre:  Crime Actors:  Al Pacino
Movie:  The Godfather  Genre:  Drama Actors:  Marlon Brando
Movie:  The Godfather  Genre:  Drama Actors:  Al Pacino
Movie:  The Godfather: Part II  Genre:  Crime Actors:  Al Pacino
Movie:  The Godfather: Part II  Genre:  Crime Actors:  Robert Duvall
Movie:  The Godfather: Part II  Genre:  Drama Actors:  Al Pacino
Movie:  The Godfather: Part II  Genre:  Drama Actors:  Robert Duvall
Movie:  The Dark Knight  Genre:  Action Actors:  Christian Bale
Movie:  The Dark Knight  Genre:  Action Actors:  Heath Ledger
Movie:  The Dark Knight  Genre:  Crime Actors:  Christian Bale
Movie:  The Dark Knight  Genre:  Crime Actors:  Heath Ledger
Movie:  The Dark Knight  Genre:  Drama Actors:  Christian Bale
Movie:  The Dark Knight  Genre:  Drama Actors:  Heath Ledger
Movie:  The Dark Knight  Genre:  Thriller Actors:  Christian Bale
Movie:  The Dark Knight  Genre:  Thriller Actors:  Heath Ledger
Movie:  12 Angry Men  Genre:  Crime Actors:  Martin Balsam
Movie:  12 Angry Men  Genre:  Crime Actors:  John Fiedler

Is it possible to get the following aggregated output instead, with one line per movie?

Movie:  The Shawshank Redemption  Genre:  Drama Actors:  Tim Robbins|Morgan Freeman
Movie:  The Godfather  Genre:  Crime|Drama  Actors:  Marlon Brando|Al Pacino
Movie:  The Godfather: Part II  Genre:  Crime|Drama  Actors:  Al Pacino|Robert Duvall
Movie:  The Dark Knight  Genre:  Action|Crime|Drama|Thriller   Actors:  Christian Bale|Heath Ledger
Movie:  12 Angry Men  Genre:  Crime Actors:  Martin Balsam|John Fiedler

Can I get this using lists? Or do I need to store all the results in a DataFrame and then use a groupby?

Thanks!

Upvotes: 0

Views: 144

Answers (1)

nbwoodward

Reputation: 3156

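Try using the str.join() method. Build the list of actor names once per movie, drop the inner loop over genres, and join each list with '|':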

import imdb

ia = imdb.IMDb()
top250 = ia.get_top250_movies()
top_actors = 2

# Only the first 10 movies are needed, so slice instead of counting and breaking
for topmovie in top250[:10]:
    # Retrieve the full movie object using its ID
    movie = ia.get_movie(topmovie.movieID)
    # Fall back to an empty list if a movie has no cast or genre data
    cast = movie.get('cast') or []
    genres = movie.get('genres') or []
    actor_names = [actor['name'] for actor in cast[:top_actors]]
    print('Movie: ', movie, ' Genre: ', '|'.join(genres), 'Actors: ', '|'.join(actor_names))
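
To see the joining step on its own, without the network calls, here is a minimal sketch. The `sample_movies` list and the `format_movie` helper are hypothetical stand-ins, not part of the imdb package; the plain dicts only mimic the `'genres'` and `'cast'` keys used above, with values taken from the question's output.

```python
# Hypothetical sample data standing in for the objects returned by ia.get_movie()
sample_movies = [
    {"title": "The Shawshank Redemption",
     "genres": ["Drama"],
     "cast": [{"name": "Tim Robbins"}, {"name": "Morgan Freeman"}]},
    {"title": "The Godfather",
     "genres": ["Crime", "Drama"],
     "cast": [{"name": "Marlon Brando"}, {"name": "Al Pacino"}]},
]


def format_movie(movie, top_actors=2):
    """Build one output line per movie, joining genres and actor names with '|'."""
    genres = "|".join(movie["genres"])
    actors = "|".join(actor["name"] for actor in movie["cast"][:top_actors])
    return f"Movie: {movie['title']}  Genre: {genres}  Actors: {actors}"


lines = [format_movie(movie) for movie in sample_movies]
for line in lines:
    print(line)
```

Because each movie already carries its own list of genres and its own cast list, plain lists and str.join() are enough here. A DataFrame with groupby would only be needed if the data had already been flattened into one row per genre/actor combination.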

Upvotes: 3
