DakotaV

Reputation: 31

Parse email info from text file

I have updated this code. Right now it extracts the first name, last name, and domain from each email address in a text file. I just need to add a counter for the number of unique domain names. For example, this input:

[email protected]
[email protected]
[email protected]

would return this:

[email protected]
first name: taco
last name: salad
domain: tacos.com

[email protected]
first name: burrito
last name: fest
domain: burrito.com

[email protected]
first name: a
last name: smith
domain: tacos.com

number of emails found:
3
number of unique domains found:
2

Here is what I have so far:


import re

count = 0
fname = input('Enter a filename: ')

with open(fname, "rt") as afile:
    for line in afile:
        # Strip the trailing newline so it does not end up in the domain
        email = line.strip()
        if re.match(r'[\w.-]+@[\w.-]+', email):
            print("Found email:" + email)
            count += 1
            split_email = email.split('@')
            name = split_email[0]

            if "." in name:
                # first.last form: split on the dot
                splitname = name.split('.')
                print("First name:" + splitname[0])
                print("Last name:" + splitname[1])
            else:
                # flast form: first letter is the initial, the rest is the last name
                print("First name:" + name[0])
                print("Last name:" + name[1:])
            print("Domain:" + split_email[1])
            print("\n")

print("Number of emails found: ")
print(count)
input('Press ENTER key to continue: ')

Upvotes: 1

Views: 124

Answers (1)

BrainDead

Reputation: 795

import re

# You can switch this with your file data
example_emails = ['[email protected]', '[email protected]', '[email protected]']

for email in example_emails:
    if re.match(r'[\w.-]+@[\w.-]+', email):
        print("Found email:" + email)
        # Split string on char @
        # Example input:
        # [email protected]
        # Output:
        # ['testUwu', 'gmail.com']
        split_email = email.split('@')
        # Split string on uppercase letters
        credentials = re.findall('[a-zA-Z][^A-Z]*', split_email[0])
        print("First name:" + credentials[0])
        print("Last name:" + credentials[1])
        print("Domain:" + split_email[1])
        # Newline for nicer output formatting
        print("\n")

Example output:

Found email:[email protected]
First name:First
Last name:Last
Domain:email.com


Found email:[email protected]
First name:F
Last name:Last
Domain:email.com

This example code works only with your two email formats.

Note that you should probably add some exception handling in case other formats slip in. For example, [email protected] will raise an IndexError because the program expects two capitalized words. Also, if the name contains more than two capital letters, the code ignores everything from the third capital letter onward.
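One way to guard against the IndexError mentioned above is to check how many chunks the regex found before indexing into the result. This is only a rough sketch: `split_name` is a hypothetical helper name, and the fallback (first letter as the initial, the rest as the last name) is an assumption about how you want all-lowercase names treated.

```python
import re

def split_name(local_part):
    """Return (first, last) for the part of an address before the '@'."""
    # Capital-delimited form, e.g. "FirstLast" -> ["First", "Last"]
    parts = re.findall(r'[a-zA-Z][^A-Z]*', local_part)
    if len(parts) >= 2:
        # Join everything after the first chunk so extra capitals are not dropped
        return parts[0], ''.join(parts[1:])
    # Fallback (an assumption): first letter is the initial, the rest is the last name
    return local_part[:1], local_part[1:]

print(split_name("FirstLast"))
print(split_name("FLast"))
print(split_name("asmith"))
```

Checking the length up front avoids a bare try/except, so an unrelated bug will not be silently swallowed.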

Those are the caveats I would like you to be aware of. If you're positive that you only have those two formats, this should work fine.
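As for the counter of unique domains the question asks about, a set is probably the simplest tool, since it drops duplicates on its own. A minimal sketch, assuming one address per line: `count_unique_domains` is a made-up name, the sample addresses are invented to match the domains in the question, and lowercasing the domain is my assumption that `Tacos.com` and `tacos.com` should count as one.

```python
def count_unique_domains(emails):
    """Count distinct domains, ignoring case and surrounding whitespace."""
    domains = set()
    for email in emails:
        email = email.strip()
        if '@' not in email:
            continue  # skip lines that are not addresses
        # rsplit keeps only the text after the last '@'
        domains.add(email.rsplit('@', 1)[1].lower())
    return len(domains)

# Made-up sample lines, with trailing newlines as they would come from a file
sample = ["taco.salad@tacos.com\n", "burrito.fest@burrito.com\n", "asmith@tacos.com\n"]
print("Number of unique domains found:")
print(count_unique_domains(sample))
```

In your loop you could equally keep a `domains = set()` next to `count`, call `domains.add(split_email[1])` for every match, and print `len(domains)` at the end.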

Upvotes: 1
