user1464296
user1464296

Reputation:

Validate email addresses in Mysql

This query creates a mysql view that captures bad email address formats in one table. So if a row is inserted in that has rtrrg.com as a email it will be recorded in the view. My question is, how do I make the view track more than one table. A second table.

The SQL

CREATE VIEW `invalid_emails` AS 
  select `table_with_email_column`.`email` AS `invalidemail` 
    from `table_with_email_column` 
   where ((locate(_latin1'', ltrim(rtrim(`table_with_email_column`.`email`))) <> 0) 
      or (left(ltrim(`table_with_email_column`.`email`), 1) = _latin1'@') 
      or (right(rtrim(`table_with_email_column`.`email`), 1) = _latin1'.') 
      or ((locate(_latin1'.', `table_with_email_column`.`email`,locate(_latin1'@', `table_with_email_column`.`email`)) -  locate(_latin1'@', `table_with_email_column`.`email`)) <= 1) 
      or ((length(ltrim(rtrim(`table_with_email_column`.`email`))) -  length(replace(ltrim(rtrim(`table_with_email_column`.`email`)), _latin1'@', _latin1''))) <> 1) 
      or (locate(_latin1'.', reverse(ltrim(rtrim(`table_with_email_column`.`email`)))) < 3) 
      or (locate(_latin1'.@', `table_with_email_column`.`email`) <> 0) 
      or (locate(_latin1'..', `table_with_email_column`.`email`) <> 0));

Upvotes: 37

Views: 103221

Answers (9)

Eliocarlos Costa
Eliocarlos Costa

Reputation: 33

Once only regexp suggestted doesn't consider all errors, I complete on my own function. See the comments into code.

DELIMITER $$

CREATE OR REPLACE function fn_DOC_Validar_EMail ( pStrEMail VARCHAR(200) )
RETURNS BIT
BEGIN

declare lIntValid bit;

set lIntValid = 0;

SELECT 1 into lIntValid
   Where pStrEMail REGEXP '^[a-zA-Z0-9][a-zA-Z0-9.!#$%&\'*+-/=?^_`{|}~]*?[a-zA-Z0-9._-]?@[a-zA-Z0-9][a-zA-Z0-9._-]*?[a-zA-Z0-9]?\\.[a-zA-Z]{2,63}$'
   and pStrEMail not like '%[^a-z0-9@._-]%' -- not allow characters differents: a-z 0-9 @ . _ -
   and pStrEMail not like '%@%@%' -- not allow  two @
   and pStrEMail not like '%.@%' -- not allow  .@
   and pStrEMail not like '%..%' -- not allow  ..
   and pStrEMail not like '%.' -- not allow  . (dot) at end
   and pStrEMail like '%_@_%_.__%' -- not allow short, i.e.,  [email protected]
   and pStrEMail not LIKE '%^%' -- not allow character ^
   and pStrEMail not LIKE '%\%%' -- not allow character %
;

return lIntValid ;
  
END;

$$

Use the samples bellow to Test your function.

Select Sequencial, email, fn_DOC_Validar_EMail(EMail) from TEMP_Emails;

Create table TEMP_Emails
(
  Sequencial int,
  Email varchar(200)
);

-- invalids
insert into TEMP_Emails values (1, '@teste.com'); -- Start with @
insert into TEMP_Emails values (2, 'josue@teste'); -- with out domain
insert into TEMP_Emails values (3, 'jo ue@teste'); -- espace
insert into TEMP_Emails values (4, 'jo"ue@teste'); -- quotes
insert into TEMP_Emails values (5, 'jo$ue@teste'); -- special Character
insert into TEMP_Emails values (6, 'josue^[email protected]'); -- special Character
insert into TEMP_Emails values (7, 'josue][email protected]'); -- special Character
insert into TEMP_Emails values (8, 'josue%[email protected]'); -- special Character
insert into TEMP_Emails values (9, '[email protected]'); -- @. 
insert into TEMP_Emails values (10, 'josue@@teste.com'); -- 2 x @
insert into TEMP_Emails values (11, 'josue@[email protected]'); -- 2 x @
insert into TEMP_Emails values (12, '[email protected]'); -- .@ 
insert into TEMP_Emails values (13, '[email protected]'); -- .. 
insert into TEMP_Emails values (14, '[email protected].'); -- . at final
 
 
--OK: Valids
insert into TEMP_Emails values (101, '[email protected]');
insert into TEMP_Emails values (102, '[email protected]');
insert into TEMP_Emails values (103, '[email protected]');
insert into TEMP_Emails values (104, '[email protected]');
insert into TEMP_Emails values (105, '[email protected]');
insert into TEMP_Emails values (106, '[email protected]');
insert into TEMP_Emails values (107, '[email protected]');
insert into TEMP_Emails values (108, '[email protected]');
insert into TEMP_Emails values (109, '[email protected]');
insert into TEMP_Emails values (110, '[email protected]');

Select Sequencial, email, fn_DOC_Validar_EMail(EMail) from TEMP_Emails;

Upvotes: 1

JEX725
JEX725

Reputation: 71

My solution for finding invalid emails:

SELECT * FROM `tbl_email` WHERE `email` NOT REGEXP '^[a-zA-Z0-9]+[a-zA-Z0-9._-]*@[a-zA-Z0-9]+[a-zA-Z0-9._-]*\.[a-zA-Z0-9]{2,63}$';

Upvotes: 0

Praveen Kumar Purushothaman
Praveen Kumar Purushothaman

Reputation: 167172

You can use a pure SELECT to validate Email Addresses:

SELECT * FROM `users` WHERE `email` NOT REGEXP '^[^@]+@[^@]+\.[^@]{2,}$';

And now for your question of tracking multiple tables, you can use comma seperated table names right?

SELECT * FROM `users`, `customers`, `clients`
         WHERE `email` NOT REGEXP "^[a-zA-Z0-9][a-zA-Z0-9.!#$%&'*+-/=?^_`{|}~]*?[a-zA-Z0-9._-]?@[a-zA-Z0-9][a-zA-Z0-9._-]*?[a-zA-Z0-9]?\\.[a-zA-Z]{2,63}$";

Upvotes: 98

Jacob Thomason
Jacob Thomason

Reputation: 3401

With MySQL 9 you can create a check constraint now that will validate all data stored in the email address column. Here is an ALTER TABLE statement:

ALTER TABLE `user` 
ADD CONSTRAINT `user.email_validation` 
    CHECK (`email` REGEXP "^[a-zA-Z0-9][a-zA-Z0-9.!#$%&'*+-/=?^_`{|}~]*?[a-zA-Z0-9._-]?@[a-zA-Z0-9][a-zA-Z0-9._-]*?[a-zA-Z0-9]?\\.[a-zA-Z]{2,63}$");

Upvotes: 1

Gilles GNANAGBE
Gilles GNANAGBE

Reputation: 41

select EmailAddress from FindInvalidEmailAddressDemo
   -> where EmailAddress NOT LIKE '%_@_%._%';

Upvotes: 0

Bud Damyanov
Bud Damyanov

Reputation: 31829

Simple SELECT statement is sufficient, for example:

 SELECT * FROM user WHERE email NOT 
 REGEXP '^[a-zA-Z0-9][+a-zA-Z0-9._-]*@[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]*\\.[a-zA-Z]{2,4}$'

This query handles the Gmail addresses with + sign and addresses where the host is a single letter.

Upvotes: 8

Amrish Prajapati
Amrish Prajapati

Reputation: 787

SELECT
    *
FROM
    users
WHERE
    email NOT REGEXP ‘ ^[ a - zA - Z0 - 9 ][ a - zA - Z0 - 9._ -]*[ a - zA - Z0 - 9 ]@[ a - zA - Z0 - 9 ][ a - zA - Z0 - 9._ -]*[ a - zA - Z0 - 9 ]\.[ a - zA - Z ]{ 2,
    63 } $’

Upvotes: 0

Sachin Parse
Sachin Parse

Reputation: 1323

For the proper email validation, you can use this regex as bellow:

SELECT
    *
FROM
    `school`
WHERE
    `email` NOT REGEXP '^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9._-]@[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]\\.[a-zA-Z]{2,63}$';

Upvotes: 18

hol
hol

Reputation: 8423

You can use a UNION in the VIEW but then you have to repeat all the WHERE statement which gives you redundant code. So you would make a helper VIEW that makes you a UNION and then apply the WHERE clause.

Demo here: SQL Fiddle Demo.

That would apply to your SQL somehow like this (untested);

CREATE VIEW `invalid_emails_helper` AS
  select `table_with_email_column`.`email` AS `invalidemail` 
    from `table_with_email_column` 
union
  select `table_with_email_column`.`email` 
    from `second_table_with_email_column` 

CREATE VIEW `invalid_emails` AS 
  select `invalidemail` as `email`
    from `invalid_emails_helper` as `table_with_email_column` 
   where ((locate(_latin1'', ltrim(rtrim(`table_with_email_column`.`email`))) <> 0) 
      or (left(ltrim(`table_with_email_column`.`email`), 1) = _latin1'@') 
      or (right(rtrim(`table_with_email_column`.`email`), 1) = _latin1'.') 
      or ((locate(_latin1'.', `table_with_email_column`.`email`,locate(_latin1'@', `table_with_email_column`.`email`)) -  locate(_latin1'@', `table_with_email_column`.`email`)) <= 1) 
      or ((length(ltrim(rtrim(`table_with_email_column`.`email`))) -  length(replace(ltrim(rtrim(`table_with_email_column`.`email`)), _latin1'@', _latin1''))) <> 1) 
      or (locate(_latin1'.', reverse(ltrim(rtrim(`table_with_email_column`.`email`)))) < 3) 
      or (locate(_latin1'.@', `table_with_email_column`.`email`) <> 0) 
      or (locate(_latin1'..', `table_with_email_column`.`email`) <> 0));

And yes, the query to check the e-mail address using a regex as can easily found everywhere in the internet simplifies it further.

Upvotes: 1

Related Questions