Reputation: 1187
I found the bash script below for email validation on a random site. It works fine, but I would like to understand how it works.
I would greatly appreciate a clear-cut explanation of it,
especially of "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}"
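#!/bin/bash
# [[ ... =~ ... ]] and read -p are bash features, so this needs bash rather than plain sh.
# Keep prompting until the input matches the email pattern.
while true; do
    read -r -p "Enter Email ID: " to_recipient
    if [[ "$to_recipient" =~ [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4} ]]
    then
        break
    else
        echo "Please enter a valid email address"
    fi
done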
Thanks again!
Upvotes: 0
Views: 2778
Reputation: 1936
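The short version is that =~ is the regex match operator: the condition succeeds when the string on the left matches the regular expression on the right. The long version is that you need to learn regular expression syntax to understand the pattern itself.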
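As a minimal sketch of how =~ behaves on its own (the variable names and sample strings here are made up for illustration, not taken from your script), something like this can be run in bash:

```bash
#!/usr/bin/env bash
# Minimal demo of the =~ operator inside [[ ... ]].
# The right-hand side is an unquoted POSIX extended regular expression.

input="abc123"   # hypothetical sample input

# ^ and $ anchor the pattern to the start and end of the string.
if [[ "$input" =~ ^[a-z]+[0-9]+$ ]]; then
    match_result="match"
else
    match_result="no match"
fi
echo "$match_result"

# After a successful match, BASH_REMATCH[0] holds the text that matched.
if [[ "order-42" =~ [0-9]+ ]]; then
    matched_text="${BASH_REMATCH[0]}"
fi
echo "$matched_text"
```

The second test has no anchors, so it succeeds as soon as the pattern matches somewhere inside the string.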
Here is a short explanation of the specific regex you present:
In regular expressions, [ and ] delimit 'character classes', which match any single character listed in the class. Within a class definition, you can specify ranges of characters using -. So the first one, [a-zA-Z0-9._%+-], is a class that matches any lowercase letter, any uppercase letter, any digit, or one of the characters . _ % + - (the final - is a literal hyphen because it comes last and so cannot form a range). The + outside of that class is a Kleene plus, which indicates one or more of the previous expression (in this case, the character class). The next bit is an @ sign, which simply matches itself. The last two classes are supposed to match a domain name: [a-zA-Z0-9.-]+ allows alphanumerics, . and - in the SLD part, \. matches a literal dot, and [a-zA-Z]{2,4} allows only 2 to 4 alphabetic characters in the TLD part (the {N,M} syntax indicates lower and upper bounds on the number of repetitions of the previous expression). I note here that a 2-4 limit is too short for the longer TLDs which are perfectly valid nowadays, such as .shopping. Also note that the pattern in your script has no ^ and $ anchors, so =~ succeeds if the pattern matches anywhere inside the input: an address with a long TLD still passes because its first few letters satisfy {2,4}, and so does input with junk around an otherwise valid address.
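To see the pieces in action, here is a rough sketch that builds the pattern from the question out of named parts and compares it with an anchored variant. The variable names, the check helper, the sample addresses, and the {2,} TLD bound are all my own assumptions for illustration; this is not a complete email validator.

```bash
#!/usr/bin/env bash
# Pieces of the pattern from the question (variable names are hypothetical).
local_part='[a-zA-Z0-9._%+-]+'   # one or more allowed characters before the @
domain='[a-zA-Z0-9.-]+'          # one or more letters, digits, dots, or hyphens
tld='[a-zA-Z]{2,4}'              # 2 to 4 letters

# The pattern as used in the question: no anchors.
original_re="${local_part}@${domain}"'\.'"${tld}"

# An assumed variant: anchored with ^ and $, and with no upper bound on TLD length.
anchored_re="^${local_part}@${domain}"'\.[a-zA-Z]{2,}$'

# Print "match" or "no match" for string $1 against regex $2.
check() {
    if [[ "$1" =~ $2 ]]; then
        echo "match"
    else
        echo "no match"
    fi
}

for address in "user@example.com" "user@example.shopping" "junk a@b.co junk" "user@example"; do
    printf '%-24s original: %-9s anchored: %s\n' \
        "$address" \
        "$(check "$address" "$original_re")" \
        "$(check "$address" "$anchored_re")"
done
```

The regex is kept in a variable and expanded unquoted on the right of =~, because quoting it there would make bash treat it as a literal string instead of a pattern.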
Matching an email address against the full grammar in the email RFCs is considerably more complicated than what you've got here.
For more information look up:
I hope this helps.
Upvotes: 1