aabujamra

Reputation: 4636

Python - Name parsing for email checking

I'm trying to build a script that creates different variations of a person's name in order to test their possible email addresses. Basically, what I want the script to do is:

To decompose a name I'm using the following:

import nameparser
name="John May Smith"
name=nameparser.HumanName(name)

# For each name part, store its lowercase initial and its lowercase full form.
parts=[]
for i in name:
    j=[i[0].lower(),i.lower()]
    parts.append(j)

This way, parts ends up like this:

[['j', 'john'], ['m', 'may'], ['s', 'smith']]

Note that the list in this case has three sublists, however it could have been 2, 4, 5 or 6.

I created another list called separators:

separators=['.','_']

My question is: What is the best way to mix those lists to create a list of possible email local-parts* as described in the example above? I've been burning my brain to find a way to do it for a few days but haven't been able to.

*The local-part is what comes before the @ in an email address (for example, "jmaysmith").

Upvotes: 0

Views: 87

Answers (1)

Jonathan von Schroeder

Reputation: 1703

The following code should do what you want:

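from itertools import chain, combinations, product

from nameparser import HumanName

def name_combinations(name):
    name = HumanName(name)

    parts = []
    ret = []
    for i in name:
        # Pair each part's lowercase initial with its lowercase full form.
        j = [i[0].lower(), i.lower()]
        # Each full part is also a candidate on its own.
        ret.append(i.lower())
        parts.append(j)

    # The empty separator simply concatenates the pieces.
    separators = ['', '.', '_']
    for r in range(2, len(parts) + 1):
        # Choose r parts (keeping their original order), then pick a separator
        # and either the initial or the full form of each chosen part.
        for c in combinations(parts, r):
            ret = chain(ret, map(lambda l: l[0].join(l[1:]), product(separators, *c)))
    # Note: ret is a lazy iterator here; wrap it in list() to materialize it.
    return ret

print(list(name_combinations("John May Smith")))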

I did not see jms, j.s or js in your examples. If that is intentional, feel free to clarify what should be excluded.
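If the initials-only variants (such as jms, j.s or js) are indeed unwanted, one possible way to drop them is sketched below. This is only an illustration under a few assumptions: the function name local_parts and the skip_initials_only flag are made up for this example, and plain whitespace splitting stands in for nameparser.HumanName so that the snippet runs without extra packages.

```python
from itertools import combinations, product

def local_parts(name, separators=("", ".", "_"), skip_initials_only=True):
    """Build candidate local-parts; optionally skip variants made only of initials."""
    # Assumption: whitespace splitting replaces nameparser.HumanName here.
    words = [w.lower() for w in name.split()]
    results = list(words)  # every full part on its own
    for r in range(2, len(words) + 1):
        for combo in combinations(words, r):
            for sep in separators:
                # One flag per chosen part: True = use the initial, False = full form.
                for flags in product((True, False), repeat=r):
                    if skip_initials_only and all(flags):
                        continue  # e.g. "js", "j.s", "j_s"
                    pieces = [w[0] if f else w for w, f in zip(combo, flags)]
                    results.append(sep.join(pieces))
    return results

print(local_parts("John Smith"))
# ['john', 'smith', 'jsmith', 'johns', 'johnsmith', 'j.smith', 'john.s',
#  'john.smith', 'j_smith', 'john_s', 'john_smith']
```

Calling local_parts("John Smith", skip_initials_only=False) should give back the same 14 entries as name_combinations.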

For reference, the output of name_combinations is:

>>> print(list(name_combinations("John Smith")))
['john', 'smith', 'js', 'jsmith', 'johns', 'johnsmith', 'j.s', 'j.smith', 'john.s', 'john.smith', 'j_s', 'j_smith', 'john_s', 'john_smith']
>>> print(list(name_combinations("John May Smith")))
['john', 'may', 'smith', 'jm', 'jmay', 'johnm', 'johnmay', 'j.m', 'j.may', 'john.m', 'john.may', 'j_m', 'j_may', 'john_m', 'john_may', 'js', 'jsmith', 'johns', 'johnsmith', 'j.s', 'j.smith', 'john.s', 'john.smith', 'j_s', 'j_smith', 'john_s', 'john_smith', 'ms', 'msmith', 'mays', 'maysmith', 'm.s', 'm.smith', 'may.s', 'may.smith', 'm_s', 'm_smith', 'may_s', 'may_smith', 'jms', 'jmsmith', 'jmays', 'jmaysmith', 'johnms', 'johnmsmith', 'johnmays', 'johnmaysmith', 'j.m.s', 'j.m.smith', 'j.may.s', 'j.may.smith', 'john.m.s', 'john.m.smith', 'john.may.s', 'john.may.smith', 'j_m_s', 'j_m_smith', 'j_may_s', 'j_may_smith', 'john_m_s', 'john_m_smith', 'john_may_s', 'john_may_smith']
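As a quick sanity check on the sizes of these lists: with n name parts and s separators, the approach above yields n single parts plus, for every r from 2 to n, C(n, r) * s * 2^r combined variants (duplicates included). The helper below is a hypothetical name introduced only to illustrate that arithmetic.

```python
from math import comb  # available in Python 3.8+

def expected_count(n_parts, n_separators=3):
    """Number of candidates produced for n_parts name parts (duplicates included)."""
    singles = n_parts
    combined = sum(
        comb(n_parts, r) * n_separators * 2 ** r  # r parts, one separator, initial or full
        for r in range(2, n_parts + 1)
    )
    return singles + combined

print(expected_count(2))  # 14, matching the "John Smith" output
print(expected_count(3))  # 63, matching the "John May Smith" output
```

The count grows quickly with the number of parts, which may matter if every candidate is going to be tested against a mail server.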

Upvotes: 1
