Reputation: 3637
I've used the following two regexes, which I found here:
For a 10-digit phone number with a space before and after it: @"(?<!\d)\d{10}(?!\d)"
For email:
@"(([\w-]+\.)+[\w-]+|([a-zA-Z]{1}|[\w-]{2,}))@" +
@"((([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])\.([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])." +
@"([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9]).([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])){1}|" +
@"([a-zA-Z]+[\w-]+.)+[a-zA-Z]{2,4})";
The email regex works fine for normal, well-formed email addresses,
but my input text isn't always well-formed, so I have to handle one more special case. I would also like it to match:
myname@this -email.com *or* myname@this- email.com *or* myname@this - email.com
The 10-digit phone number regex only extracts the last 9 digits when the phone number starts with 0:
for the number 0123456789 I only get 123456789. The phone number appears in text like:
"some text here 0123456789 and some more here".
I've also found that the 10-digit number may appear in this form: 012/3456789
Upvotes: 0
Views: 105
Reputation: 9644
I don't believe you need lookarounds to match your phone number. You could do something like " (\d[^0-9a-zA-Z]*?){9}\d " to match numbers exactly 10 digits long, with any kind of junk between the digits (anything except letters and other digits). Careful though, it would also match stuff like " 012 à 3456789 ". If the single slash is the only possible separator, you can use " (\d/?){9}\d ": no false positives, but you won't match unexpected separators.
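As a rough sketch of how these two phone patterns behave, here they are in Python (used only for illustration; the same pattern syntax should work with .NET's Regex). The lookarounds from the question are kept as an assumption so that an 11-digit run is not partially matched, and the names `LOOSE`, `SLASH_ONLY` and `find_phone_numbers` are made up for this example. Note that the pattern itself matches all ten digits including the leading 0, so one plausible cause of the lost zero (an assumption, since the calling code isn't shown) is converting the match to a number.

```python
import re

# First pattern: any non-alphanumeric junk allowed between digits.
# The (?<!\d) / (?!\d) guards are an addition taken from the question's regex.
LOOSE = re.compile(r"(?<!\d)(?:\d[^0-9a-zA-Z]*?){9}\d(?!\d)")

# Second pattern: only an optional single slash after each digit.
SLASH_ONLY = re.compile(r"(?<!\d)(?:\d/?){9}\d(?!\d)")


def find_phone_numbers(text, pattern=SLASH_ONLY):
    """Return each match as a string of digits, separators stripped."""
    return [re.sub(r"\D", "", m.group(0)) for m in pattern.finditer(text)]


if __name__ == "__main__":
    sample = "some text here 0123456789 and some more here"
    print(find_phone_numbers(sample))              # kept as a string, zero intact
    print(find_phone_numbers("call 012/3456789"))  # slash form
    print(int(find_phone_numbers(sample)[0]))      # numeric conversion drops the 0
```

Keeping the result as a string rather than an integer preserves the leading zero.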
Regarding the email, a fully standards-compliant regex is very long. Gusdor's link (http://www.regular-expressions.info/email.html) gives some context and a simpler one that should match pretty much every email address actually in use. Tweaking it a little to cover your case (so that this -domain.com or this- domain.org would be matched), you could use something like this:
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9]*( *?- *)?[a-z0-9]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
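The custom part here is the ( *?- *)? group, added inside the domain label so that a hyphen may have optional spaces on either side. The pattern only lists lowercase letters, so use it with case-insensitive matching.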
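A minimal sketch of the tweaked email pattern in use, again in Python for illustration (the pattern syntax should carry over to .NET's Regex with the IgnoreCase option). The helper name `extract_emails` and the step that strips spaces from the match are assumptions added for this example, not part of the pattern above.

```python
import re

# The pattern from above, split over three string literals for readability.
EMAIL = re.compile(
    r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@(?:[a-z0-9](?:[a-z0-9]*( *?- *)?[a-z0-9]*[a-z0-9])?\.)+"
    r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?",
    re.IGNORECASE,
)


def extract_emails(text):
    """Find addresses and remove the stray spaces around the hyphen."""
    return [m.group(0).replace(" ", "") for m in EMAIL.finditer(text)]


if __name__ == "__main__":
    for sample in (
        "write to myname@this -email.com today",
        "write to myname@this- email.com today",
        "write to myname@this - email.com today",
    ):
        print(extract_emails(sample))
```

Because the custom group is optional and appears once per label, this tolerates a single spaced hyphen in each domain label; ordinary addresses without a hyphen still match.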
Upvotes: 1