Linking first name and gender in Faker library

Question

I am trying to generate a fake dataset for my research using the Faker library, but I am unable to link each person's first name to their gender. How can I do this? My function is given below.

def faker_categorical(num=1, seed=None):
    np.random.seed(seed)
    fake.seed_instance(seed)
    
    output = [
        {
            "gender": np.random.choice(["M", "F"], p=[0.5, 0.5]),
            "GivenName": fake.first_name_male() if "gender"=="M" else fake.first_name_female(),
            "Surname": fake.last_name(),
            "Zipcode": fake.zipcode(),
            "Date of Birth": fake.date_of_birth(),
            "country": np.random.choice(["United Kingdom", "France", "Belgium"]),
        }
        for x in range(num)
    ]
    return output

df = pd.DataFrame(faker_categorical(num=1000))

tripleee · Accepted Answer

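Your question is unclear, but I guess what you are looking for is a way to refer to the result from np.random.choice() from two different places in your code. In your version, "gender"=="M" compares two string literals, which is always false, so every row gets a female name regardless of the gender drawn. The fix is easy: assign the result to a temporary variable, then refer to that variable from both places.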

import numpy as np
from faker import Faker

# Assumed setup: the question does not show how `fake` was created
fake = Faker()

def faker_categorical(num=1, seed=None):
    np.random.seed(seed)
    fake.seed_instance(seed)

    output = []
    for _ in range(num):
        # Draw the gender once so the name can depend on the same value
        gender = np.random.choice(["M", "F"], p=[0.5, 0.5])
        output.append(
            {
                "gender": gender,
                "GivenName": fake.first_name_male() if gender == "M" else fake.first_name_female(),
                "Surname": fake.last_name(),
                "Zipcode": fake.zipcode(),
                "Date of Birth": fake.date_of_birth(),
                "country": np.random.choice(["United Kingdom", "France", "Belgium"]),
            }
        )
    return output
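
To see why the comparison in the question never picks a male name, here is a minimal sketch that isolates just that expression. The functions first_name_male() and first_name_female() below are hypothetical stand-ins for the Faker calls, returning fixed names so the snippet runs without Faker installed; buggy_record() and fixed_record() are illustrative names too.

```python
# Hypothetical stand-ins for fake.first_name_male() / fake.first_name_female();
# they return fixed names so the difference is easy to see.
def first_name_male():
    return "John"


def first_name_female():
    return "Jane"


def buggy_record(gender):
    # "gender" == "M" compares two string literals, so it is always False
    # and the female branch is taken no matter what `gender` holds.
    return {
        "gender": gender,
        "GivenName": first_name_male() if "gender" == "M" else first_name_female(),
    }


def fixed_record(gender):
    # gender == "M" compares the value stored in the variable.
    return {
        "gender": gender,
        "GivenName": first_name_male() if gender == "M" else first_name_female(),
    }


print(buggy_record("M"))  # {'gender': 'M', 'GivenName': 'Jane'}
print(fixed_record("M"))  # {'gender': 'M', 'GivenName': 'John'}
```

The only difference between the two functions is the quotes around gender in the condition, which is why the value has to live in a variable that both dictionary entries can read.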
