ComplexData

Reputation: 1093

Count based on other csv file

I have a dataframe df with two columns called 'MovieName' and 'Actors'. It looks like:

MovieName    Actors
lights out   Maria Bello
legend       Tom Hardy*Emily Browning*Christopher Eccleston*David Thewlis

Please note that different actor names are separated by '*'. I have another csv file called gender.csv which has the gender of all actors based on their first names. gender.csv looks like -

ActorName    Gender
Tom          male
Emily        female
Christopher  male

I want to add two columns to my dataframe, 'female_actors' and 'male_actors', containing the count of female and male actors in that particular movie respectively.

How do I achieve this task using both df and gender.csv in pandas?

Please note that -

  1. If a particular name isn't present in gender.csv, don't count it in the total.
  2. If a movie has just one actor and that actor isn't present in gender.csv, then its count should be zero.

Result of above example should be -

MovieName    Actors                                                         male_actors    female_actors
lights out   Maria Bello                                                    0              0
legend       Tom Hardy*Emily Browning*Christopher Eccleston*David Thewlis   2              1

Upvotes: 1

Views: 88

Answers (1)

Quazi Marufur Rahman

Reputation: 2623

import pandas as pd

df1 = pd.DataFrame({'MovieName': ['lights out', 'legend'], 'Actors':['Maria Bello', 'Tom Hardy*Emily Browning*Christopher Eccleston*David Thewlis']})
df2 = pd.DataFrame({'ActorName': ['Tom', 'Emily', 'Christopher'], 'Gender':['male', 'female', 'male']})


def func(actors, gender):
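    # Take the first name of each '*'-separated actor
    actors = [act.split()[0] for act in actors.split('*')]
    # Count rows of df2 with the requested gender and a matching first name
    n_gender = df2.Gender[(df2.Gender == gender) & df2.ActorName.isin(actors)].count()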
    return n_gender

df1['male_actors'] = df1.Actors.apply(lambda x: func(x, 'male'))
df1['female_actors'] = df1.Actors.apply(lambda x: func(x, 'female'))

df1.to_csv('res.csv', index=False)
print(df1)

Output

Actors,MovieName,male_actors,female_actors
Maria Bello,lights out,0,0
Tom Hardy*Emily Browning*Christopher Eccleston*David Thewlis,legend,2,1
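
One caveat with the isin approach above: it counts matching rows of the gender table, so two actors who share a first name (say 'Tom Hardy*Tom Cruise') would be counted once. If every actor should be counted individually, a possible variant is to build a first-name lookup and count per actor. This is only a sketch: gender_of and count_gender are names made up for illustration, and it assumes each first name appears once in gender.csv.

```python
import pandas as pd

df1 = pd.DataFrame({
    'MovieName': ['lights out', 'legend'],
    'Actors': ['Maria Bello',
               'Tom Hardy*Emily Browning*Christopher Eccleston*David Thewlis'],
})
# Built inline so the sketch runs as-is; with the real file this would be
# df2 = pd.read_csv('gender.csv')
df2 = pd.DataFrame({'ActorName': ['Tom', 'Emily', 'Christopher'],
                    'Gender': ['male', 'female', 'male']})

# Hypothetical helper: first name -> gender lookup
# (assumes each first name appears only once in the gender table)
gender_of = dict(zip(df2['ActorName'], df2['Gender']))


def count_gender(actors, gender):
    """Count actors in a '*'-separated string whose first name maps to `gender`."""
    first_names = [name.split()[0] for name in actors.split('*') if name.strip()]
    # Names missing from the lookup return None and are therefore not counted
    return sum(gender_of.get(first) == gender for first in first_names)


df1['male_actors'] = df1['Actors'].apply(lambda a: count_gender(a, 'male'))
df1['female_actors'] = df1['Actors'].apply(lambda a: count_gender(a, 'female'))

print(df1)
```

For the sample data this gives the same counts as above (0 and 0 for 'lights out', 2 male and 1 female for 'legend'), while a string such as 'Tom Hardy*Tom Cruise' counts as two male actors.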

Upvotes: 3
