Reputation: 9861
Looking on SO there are various approaches to this problem; however, the recommended solution, for instance, does not deal with "Last, First", and the suggestion posted by richard in that post is missing the code for SetUpTextFieldParser().
I have the following list of email addresses as a string:
string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>";
The current code does:
str.Split(",");
which produces an incorrect list because of the comma in:
"Last, First"
Anyone got something elegant here to share so that I end up with an array of strings in the form:
Last, First <[email protected]>
[email protected]
First Last <[email protected]>
"First Last" <[email protected]>
"Last, First" <[email protected]>
EDIT - SOLUTION
I ended up using Yacoub Massad's solution as it was simple (regular expressions would be hard to maintain in my dev group as not everyone understands them). Below is the code (Fiddle) with some additions and simplistic testing to make sure all was well:
using System;
using System.Collections.Generic;
using System.Net.Mail;
public class Program
{
    public static void Main()
    {
        //https://msdn.microsoft.com/en-us/library/system.net.mail.mailaddress(v=vs.110).aspx
        //Some esoteric "comment" formats as well as a trailing comma in case someone did not tidy up
        string emails = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>, (comment)\"First, Last\"(comment)<(comment)joe(comment)@(comment)there.com(comment)>(comment),";
        List<string> result = new List<string>();
        Console.WriteLine("LOOP");
        while (true)
        {
            int position_of_at = emails.IndexOf("@");
            if (position_of_at == -1)
            {
                break;
            }
            int position_of_comma = emails.IndexOf(",", position_of_at);
            if (position_of_comma == -1)
            {
                result.Add(emails);
                break;
            }
            string email = emails.Substring(0, position_of_comma);
            result.Add(email);
            emails = emails.Substring(position_of_comma + 1);
        }
        Console.WriteLine("/LOOP");
        //Do some very basic validation of above code
        var i = 1;
        if (result.Count == 6)
            Console.WriteLine("SUCCESS: " + result.Count);
        else
            Console.WriteLine("FAILURE: " + result.Count);
        foreach (string emailAddress in result)
        {
            Console.WriteLine("==== " + i.ToString());
            Console.WriteLine(emailAddress);
            Console.WriteLine("/====");
            MailAddress mailAddress = new MailAddress(emailAddress);
            Console.WriteLine(mailAddress.DisplayName);
            Console.WriteLine("---- " + i.ToString());
            i++;
        }
    }
}
Upvotes: 5
Views: 4541
Reputation: 407
Try
UserEmails?.Split(';',',',' ','\n','\t').Where(x => !string.IsNullOrWhiteSpace(x)).ToList();
Upvotes: 0
Reputation: 9095
Here's a version that handles a few more edge cases and has fewer allocations:
public static List<string> ExtractEmailAddresses(string text)
{
    var items = new List<string>();
    if (String.IsNullOrEmpty(text))
    {
        return items;
    }
    int start = 0;
    bool foundAt = false;
    int comment = 0;
    for (int i = start; i < text.Length; i++)
    {
        switch (text[i])
        {
            case '@':
                if (comment == 0) { foundAt = true; }
                break;
            case '(':
                comment++;
                break;
            case ')':
                comment--;
                break;
            case ',':
                HandleLastBlock(i);
                break;
        }
    }
    HandleLastBlock(text.Length);
    return items;
    void HandleLastBlock(int end)
    {
        if (comment == 0 && foundAt && start < end - 1)
        {
            var email = new System.Net.Mail.MailAddress(text.Substring(start, end - start));
            items.Add(email.Address);
            start = end + 1;
            foundAt = false;
        }
    }
}
Upvotes: 0
Reputation: 11228
You can use Regex.Split with @"(?<=@\S*)\s+" - it splits on a space (or spaces) preceded by a word containing @:
string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>";
string[] arr = Regex.Split(str, @"(?<=@\S*)\s+");
foreach (var s in arr)
    Console.WriteLine(s);
output:
Last, First <[email protected]>,
[email protected],
First Last <[email protected]>,
"First Last" <[email protected]>,
"Last, First" <[email protected]>
Upvotes: 0
Reputation: 3326
The shortest method would be:
string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>";
string[] separators = new string[] { "com>,","com,","com>","com"};
var outputEmail = str.Split(separators,StringSplitOptions.RemoveEmptyEntries).Where(s=>s.Contains("@")).Select(s=>{return s.Contains('<') ? (s+"com>").Trim() : (s+"com").Trim();});
foreach (var email in outputEmail)
{
    MessageBox.Show(email);
}
Upvotes: 0
Reputation: 26301
Here is a nice and elegant short method that will do what you ask using a regular expression:
private IEnumerable<string> GetEmails(string input)
{
    if (String.IsNullOrWhiteSpace(input)) yield break;
    MatchCollection matches = Regex.Matches(input, @"[^\s<]+@[^\s,>]+");
    foreach (Match match in matches) yield return match.Value;
}
You would call it like this:
string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>";
IEnumerable<string> emails = GetEmails(str);
Please note that this regular expression does not validate the email addresses; for instance, the email 1@h will be considered valid and you will get it as a match.
Creating such a regex validator would be a difficult job and probably not the best option.
For retrieving purposes, I think it is the ideal tool.
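To make the behaviour concrete, here is a minimal, self-contained sketch of the same pattern run against made-up input. The class name RegexDemo and the example.com addresses are assumptions for illustration only, not part of the original answer. Note that the pattern returns only the bare addresses, so the display names asked for in the question are dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; the addresses below are invented for illustration.
public static class RegexDemo
{
    // Same pattern as above: a run of characters that are not whitespace or '<',
    // then '@', then a run of characters that are not whitespace, ',' or '>'.
    public static IEnumerable<string> GetEmails(string input)
    {
        if (string.IsNullOrWhiteSpace(input)) yield break;
        foreach (Match match in Regex.Matches(input, @"[^\s<]+@[^\s,>]+"))
            yield return match.Value;
    }

    public static void Main()
    {
        string input = "Last, First <lf@example.com>, solo@example.com, \"Last, First\" <q@example.com>";
        foreach (string address in GetEmails(input))
            Console.WriteLine(address);
        // Prints:
        // lf@example.com
        // solo@example.com
        // q@example.com
    }
}
```

If the display names are needed, a splitting approach (rather than matching) is probably the better fit.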
Upvotes: 1
Reputation: 459
Not exactly elegant, but try this:
private static IEnumerable<string> GetEntries(string str)
{
    List<string> entries = new List<string>();
    StringBuilder entry = new StringBuilder();
    while (str.Length > 0)
    {
        char ch = str[0];
        //If the first character of the string is a comma, and the entry already contains an '@',
        //add this entry to the entries list and clear the temporary entry item.
        if (ch == ',' && entry.ToString().Contains("@"))
        {
            entries.Add(entry.ToString());
            entry.Clear();
        }
        //Otherwise, just add the character to the temporary entry item.
        else
        {
            entry.Append(ch);
        }
        str = str.Remove(0, 1);
    }
    //Add the last entry, which is still in the buffer because it doesn't end with a ',' character.
    entries.Add(entry.ToString());
    return entries;
}
It splits the entries on commas, but only on a comma that comes after an '@' character within the current entry.
You would call it like this:
string str = "Last, First <[email protected]>, [email protected], First Last <[email protected]>, \"First Last\" <[email protected]>, \"Last, First\" <[email protected]>";
var entries = GetEntries(str);
Upvotes: 0
Reputation: 27871
Try this:
public List<string> ExtractEmails(string emails)
{
    List<string> result = new List<string>();
    while (true)
    {
        int position_of_at = emails.IndexOf("@");
        if (position_of_at == -1)
        {
            break;
        }
        int position_of_comma = emails.IndexOf(",", position_of_at);
        if (position_of_comma == -1)
        {
            result.Add(emails);
            break;
        }
        string email = emails.Substring(0, position_of_comma);
        result.Add(email);
        emails = emails.Substring(position_of_comma + 1);
    }
    return result;
}
It assumes that all emails are going to contain the @ character.
It works by considering only the commas that appear after the @ character as splitting commas; other commas are considered part of the email.
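As a usage illustration, the rule can be exercised on made-up input. This is a sketch only: the class SplitDemo, the method SplitAddresses, and the example.com addresses are hypothetical, and this variant walks the string by index and trims each entry, which is an assumption about what a caller would want rather than what the code above does.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; names and addresses are invented for illustration.
public static class SplitDemo
{
    // Splits only on a comma that follows an '@'; commas before the '@'
    // (for example inside "Last, First") stay part of the entry.
    public static List<string> SplitAddresses(string input)
    {
        var result = new List<string>();
        int start = 0;
        while (start < input.Length)
        {
            int at = input.IndexOf('@', start);
            if (at == -1) break;                       // no further address
            int comma = input.IndexOf(',', at);        // first comma after the '@'
            int end = comma == -1 ? input.Length : comma;
            result.Add(input.Substring(start, end - start).Trim());
            start = end + 1;                           // skip the splitting comma
        }
        return result;
    }

    public static void Main()
    {
        string input = "Last, First <lf@example.com>, solo@example.com, \"Last, First\" <q@example.com>";
        foreach (string entry in SplitAddresses(input))
            Console.WriteLine(entry);
        // Prints:
        // Last, First <lf@example.com>
        // solo@example.com
        // "Last, First" <q@example.com>
    }
}
```

Like the original, this sketch would split in the wrong place if a display name itself contained an @ character.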
Upvotes: 4