Reputation: 403
I am trying to generate a pandas dataset comprising person data. I am employing Python's Faker library. Is there a way to generate a valid email address using the first name and last name?
import pandas as pd
import numpy as np
import os
import random
from faker import Faker

fake = Faker()  # shared Faker instance used by the function below

def faker_categorical(num=1, seed=None):
    # Seed both NumPy and Faker so repeated runs give the same records.
    np.random.seed(seed)
    fake.seed_instance(seed)
    output = []
    for x in range(num):
        gender = np.random.choice(["M", "F"], p=[0.5, 0.5])
        output.append(
            {
                "First name": fake.first_name_male() if gender == "M" else fake.first_name_female(),
                "Last name": fake.last_name(),
                "E-mail": fake.ascii_email(),
            })
    return output
Upvotes: 4
Views: 17773
Reputation: 3616
I'll show an alternative to the accepted answer in case you need to build more complex data. This solution relies on providers, which are part of the Faker library. You define a new provider and then add it to your instance of Faker(). You can then call this generator wherever you need it.
from faker import Faker
from faker.providers import BaseProvider

class CustomProvider(BaseProvider):
    __provider__ = "personalia"

    def personalia(self):
        # Pick a gender first so the first name matches it.
        gender = self.random_element(["F", "M"])
        first_name = self.generator.first_name_male() if gender == "M" else self.generator.first_name_female()
        last_name = self.generator.last_name()
        # Build the address from the names generated above plus a random domain.
        email_address = f"{first_name.lower()}.{last_name.lower()}@{self.generator.domain_name()}"
        return {
            "First name": first_name,
            "Last Name": last_name,
            "E-mail": email_address
        }

fake = Faker()
fake.add_provider(CustomProvider)
personalia = fake.personalia()
print(personalia)
The output should look like this:
{
'First name': 'Olivia',
'Last Name': 'Cook',
'E-mail': '[email protected]'
}
Of course, this is just a simple example based on the code you provided. ;)
Upvotes: 1
Reputation: 2286
You can use Faker's domain_name method and string formatting alongside the already generated values:
first_name = fake.first_name_male() if gender == "M" else fake.first_name_female()
last_name = fake.last_name()
output.append(
    {
        "First name": first_name,
        "Last Name": last_name,
        "E-mail": f"{first_name}.{last_name}@{fake.domain_name()}"
    }
)
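One caveat with the f-string above: the names are inserted verbatim, so a last name such as O'Connor or De la Cruz ends up with capitals, apostrophes, or spaces in the address. Here is a minimal sketch of a normalizing step, assuming you only want lowercase ASCII letters and digits in the local part. `slugify_name` and `make_email` are hypothetical helper names of my own, not part of Faker:

```python
import re
import unicodedata


def slugify_name(name):
    """Reduce a name to lowercase ASCII letters and digits (hypothetical helper)."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    # Remove everything except letters and digits (spaces, apostrophes, hyphens, ...).
    return re.sub(r"[^a-z0-9]", "", ascii_name.lower())


def make_email(first_name, last_name, domain):
    """Build 'first.last@domain' from already generated values (hypothetical helper)."""
    return f"{slugify_name(first_name)}.{slugify_name(last_name)}@{domain}"


print(make_email("Zoë", "O'Connor", "example.com"))  # zoe.oconnor@example.com
```

Inside the loop you would then call make_email(first_name, last_name, fake.domain_name()) instead of formatting the string inline.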
For a more complete approach, you could add factory_boy to the mix:
import faker
from factory import DictFactory, LazyAttribute
from factory.fuzzy import FuzzyChoice
from factory import Faker

fake = faker.Faker()  # plain Faker instance used inside the lambdas below

class PersonDataFactory(DictFactory):
    first = LazyAttribute(lambda obj: fake.first_name_male() if obj._gender == "M" else fake.first_name_female())
    last = Faker("last_name")
    email = LazyAttribute(lambda obj: f"{obj.first}.{obj.last}@{fake.domain_name()}")
    _gender = FuzzyChoice(("M", "F"))

    class Meta:
        # _gender only drives the first name, so keep it out of the resulting dict.
        exclude = ("_gender",)
        rename = {"first": "First Name", "last": "Last Name", "email": "E-mail"}

PersonDataFactory()
This will result in something like:
{'First Name': 'Albert',
'Last Name': 'Martinez',
'E-mail': '[email protected]'}
Upvotes: 6