Reputation: 4993
I am trying to use a regular expression to extract the part of an email address between the "@" sign and the "." character. This is how I am currently doing it, but I can't get the right results.
company = re.findall('^From:.+@(.*).',line)
Gives me:
['@iupui.edu']
I want to get rid of the .edu
Upvotes: 2
Views: 2268
Reputation: 8303
A simple example would be:
>>> import re
>>> re.findall(".*(?<=\@)(.*?)(?=\.)", "From: [email protected]")
['moo']
>>> re.findall(".*(?<=\@)(.*?)(?=\.)", "From: [email protected]")
['moo-hihihi']
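This matches the hostname regardless of what comes before it on the line: the leading .* is greedy, so it consumes everything up to the last "@", and the lazy group then stops at the first "." that follows.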
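As a rough sketch of how the greedy prefix and the lazy capture interact, the same pattern can be written as a raw string and run against two sample lines. Both sample lines and their addresses are made up for illustration and do not come from the original data.

```python
import re

# Same pattern as above, written as a raw string so "\." reaches the
# regex engine unchanged. "@" needs no escaping.
pattern = re.compile(r".*(?<=@)(.*?)(?=\.)")

# Hypothetical sample lines (assumed, not from the original post).
single = "From: someone@example-host.org Sat Jan  5 09:14:16 2008"
double = "From: first@alpha.org cc: second@beta.org"

single_hosts = pattern.findall(single)
double_hosts = pattern.findall(double)

print(single_hosts)  # ['example-host']
print(double_hosts)  # ['beta'], the greedy .* skips ahead to the last "@"
```

With two addresses on one line only the last one is reported, which may or may not be what you want.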
Upvotes: 2
Reputation: 4090
To match a literal . in your regex, you need to use \. instead, so your code should look like this:
company = re.findall(r'^From:.+@(.*)\.', line)
#                                   ^ this position was wrong
Note that this will always match up to the last occurrence of . in your string, because (.*) is greedy. If you want to stop at the first occurrence, you need to exclude any . from your capturing group:
company = re.findall(r'^From:.+@([^\.]*)\.', line)
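To make the difference between the two patterns concrete, here is a small side-by-side sketch. The sample line is an assumption chosen to have more than one dot after the "@"; it is not taken from the question.

```python
import re

# Hypothetical input with a multi-part hostname (assumed for illustration).
line = "From: someone@mail.example.org Sat Jan  5 09:14:16 2008"

# Greedy group: (.*) runs to the end and backtracks to the LAST dot.
greedy = re.findall(r'^From:.+@(.*)\.', line)

# Negated class: [^.]* cannot cross a dot, so it stops at the FIRST one.
first_label = re.findall(r'^From:.+@([^.]*)\.', line)

print(greedy)       # ['mail.example']
print(first_label)  # ['mail']
```

Inside a character class the dot is already literal, so [^.] and [^\.] behave the same.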
Upvotes: 3
Reputation: 180391
You could just split and find:
s = " [email protected] I"
s = s.split("@", 1)[-1]
print(s[:s.find(".")])
Or use only split if the string is not always going to contain a match:
s = s.split("@", 1)[-1].split(".", 1)[0]
If it always does match, find will be the fastest:
i = s.find("@")
s = s[i+1:s.find(".", i)]
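One caveat with the find version: str.find returns -1 when the character is missing, and a slice built from -1 quietly returns the wrong text instead of failing. A minimal sketch of a guarded variant follows; the helper name domain_label and the sample strings are hypothetical, introduced only for this example.

```python
def domain_label(text):
    """Return the text between the first "@" and the next ".", or None."""
    at = text.find("@")
    if at == -1:
        return None  # no "@" at all
    dot = text.find(".", at)
    if dot == -1:
        return None  # no "." after the "@"
    return text[at + 1:dot]


# Sample inputs are made up for illustration.
print(domain_label("From: someone@example.org"))
print(domain_label("no address here"))
```

Returning None for a missing "@" or "." is a design choice here; raising an exception would work just as well if a missing address should count as an error.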
Upvotes: 1