Reputation: 6944
I want to get e-mail formed texts in a field. I have tried sql below but no luck. See SqlFiddle. Removing ^ and $ from regexp not working too.
WITH TEST_DATA AS (
SELECT '[email protected]' AS EMAIL FROM DUAL UNION ALL
SELECT 'mail [email protected]' FROM DUAL UNION ALL
SELECT 'mail [email protected] sent' FROM DUAL UNION ALL
SELECT '[email protected] sent count 23' FROM DUAL UNION ALL
SELECT 'mail already sent to [email protected] and [email protected]' FROM DUAL UNION ALL
SELECT '[email protected] sent count 23' FROM DUAL
)SELECT REGEXP_SUBSTR(EMAIL,'^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}$') MAIL
FROM TEST_DATA;
Expected output for this dataset
[email protected]
[email protected]
[email protected]
[email protected]
[email protected], [email protected]
[email protected]
Any help appreciated.
Upvotes: 3
Views: 9315
Reputation: 10525
If you want to extract multiple mail ids in a single column, you can use REGEXP_REPLACE
function.
Assuming all the ids in your data are valid ones,
REGEXP_REPLACE (EMAIL, '(\w+@\w+\.\w+ ?)|(.)', '\1')
This removes all other text except for mail ids that are separated by at least a space.
You can then remove any trailing spaces and add comma to separate multiple ids.
REPLACE (TRIM (REGEXP_REPLACE (EMAIL, '(\w+@\w+\.\w+ ?)|(.)', '\1')),
' ',
', ')
Example:
WITH TEST_DATA
AS (SELECT '[email protected]' AS EMAIL FROM DUAL
UNION ALL
SELECT 'mail [email protected]' FROM DUAL
UNION ALL
SELECT 'mail [email protected] sent to [email protected] and [email protected]' FROM DUAL
UNION ALL
SELECT '[email protected] sent count 23 and [email protected]' FROM DUAL
UNION ALL
SELECT 'mail already sent to [email protected] and [email protected]' FROM DUAL
UNION ALL
SELECT '[email protected] sent count 23' FROM DUAL)
SELECT REPLACE (TRIM (REGEXP_REPLACE (EMAIL, '(\w+@\w+\.\w+ ?)|(.)', '\1')),
' ',
', ')
MAIL
FROM TEST_DATA;
MAIL
-----------------------------
[email protected]
[email protected]
[email protected], [email protected], [email protected]
[email protected], [email protected]
[email protected], [email protected]
[email protected]
Upvotes: 5
Reputation: 861
You are close! try this
SELECT REGEXP_SUBSTR(EMAIL,'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}') MAIL
edited:
Maybe this helps:
WITH TEST_DATA AS (
SELECT '[email protected]' AS EMAIL FROM DUAL UNION ALL
SELECT 'mail [email protected]' FROM DUAL UNION ALL
SELECT 'mail [email protected] sent' FROM DUAL UNION ALL
SELECT '[email protected] sent count 23' FROM DUAL UNION ALL
SELECT 'mail already sent to [email protected] and [email protected]' FROM DUAL UNION ALL
SELECT '[email protected] sent count 23' FROM DUAL
)SELECT REGEXP_SUBSTR(EMAIL,'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}') MAIL,
REGEXP_SUBSTR(EMAIL,'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}',1,2) MAIL2
FROM TEST_DATA
I don't see a way to report 'n' number of matches. I also do not realize how to insert a comma and output into one column. I would bet that if possible, the query will be become quite complex with multiple inner selects/finds/replaces occuring. A better solution may be to return the original result to another language for parsing or to perform such parsing using pl/sql.
Another edit:
Here is what I meant regarding the inner selects. Exact solution to the asked question :-)
select CASE WHEN MAIL2 is not null THEN mail||', '||mail2 ELSE mail END as mail
from (
WITH TEST_DATA AS (
SELECT '[email protected]' AS EMAIL FROM DUAL UNION ALL
SELECT 'mail [email protected]' FROM DUAL UNION ALL
SELECT 'mail [email protected] sent' FROM DUAL UNION ALL
SELECT '[email protected] sent count 23' FROM DUAL UNION ALL
SELECT 'mail already sent to [email protected] and [email protected]' FROM DUAL UNION ALL
SELECT '[email protected] sent count 23' FROM DUAL
)SELECT REGEXP_SUBSTR(EMAIL,'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}') MAIL,
REGEXP_SUBSTR(EMAIL,'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}',1,2) MAIL2
FROM TEST_DATA
)
I also stubled upon this Oracle articale which discusses e-mail matching at point 8. It might be worth a peek. http://www.orafaq.com/node/2404
Upvotes: 3