Reputation:
I have a data set with a first_name column:
first_name: ajay, amit, raj, mona, sema
I want to identify each person's gender from their first name and store it in a new column, Gender, so I used this code:
!pip install gender_guesser
!pip install xlrd
!pip install openpyxl
import pandas as pd
import gender_guesser.detector as gender
df=pd.read_excel('adarsh.xlsx')
gd = gender.Detector()
df['Gender'] = df['first_name'].map(lambda x: gd.get_gender(x))
df['Gender']
Output:
0 male
1 unknown
2 unknown
3 unknown
4 unknown
Name: Gender, dtype: object
How can I get a gender for every name instead of unknown?
Upvotes: 2
Views: 8496
Reputation: 1
The likely cause is capitalization: by default, gender-guesser is case sensitive. Try turning case sensitivity off when you create the detector, then look the names up again. See the documentation: https://pypi.org/project/gender-guesser/
>>> d = gender.Detector(case_sensitive=False)
>>> print(d.get_gender(u"sally"))
female
>>> print(d.get_gender(u"Sally"))
female
Upvotes: 0
Reputation: 79
In general, gender_guesser.detector expects properly capitalized names.
So just add a step that capitalizes the names before passing them to the guesser:
df['Gender'] = df['first_name'].apply(str.capitalize).map(lambda x: gd.get_gender(x))
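If the column can contain empty cells or stray whitespace, the capitalization step may be worth wrapping in a small helper. The following is only a sketch: normalize_name, guess_gender and demo_lookup are hypothetical names, and demo_lookup is a stand-in so the snippet runs without the library installed. It is not part of gender_guesser.

```python
def normalize_name(value):
    """Return a capitalized first name, or None for missing/blank input."""
    if not isinstance(value, str):
        return None  # e.g. NaN read from an empty Excel cell
    cleaned = value.strip()
    return cleaned.capitalize() if cleaned else None


def guess_gender(value, get_gender):
    """Normalize the name, then call the supplied lookup function."""
    name = normalize_name(value)
    return get_gender(name) if name is not None else "unknown"


# Stand-in lookup so the sketch is runnable on its own (an assumption,
# not real gender_guesser data). With the library, pass gd.get_gender.
_demo_table = {"Mona": "female", "Raj": "male"}


def demo_lookup(name):
    return _demo_table.get(name, "unknown")


results = [guess_gender(n, demo_lookup) for n in ["mona", " RAJ ", float("nan"), ""]]
print(results)
```

With the detector from the question, this would be used as df['Gender'] = df['first_name'].map(lambda x: guess_gender(x, gd.get_gender)).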
Upvotes: 1
Reputation: 389
A result of unknown means the package could not identify the gender for the name you passed in.
For the first name, the gender was identified.
You should check which names are covered by the gender-guesser Python library.
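To see which names need checking, you could count the results and list the names that came back as unknown. This is a minimal sketch using the values from the question's output; summarize is a hypothetical helper, not something the library provides.

```python
from collections import Counter


def summarize(names, genders):
    """Count each result label and collect the names reported as unknown."""
    counts = Counter(genders)
    unknown = sorted({n for n, g in zip(names, genders) if g == "unknown"})
    return counts, unknown


# Names and results copied from the question.
names = ["ajay", "amit", "raj", "mona", "sema"]
genders = ["male", "unknown", "unknown", "unknown", "unknown"]

counts, unknown = summarize(names, genders)
print(counts)
print(unknown)
```

With a DataFrame, the same inputs would be df['first_name'] and df['Gender'].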
Upvotes: 0