Craig

Reputation: 18694

Split name into fields?

Is there an amazing RegEx or method in C# that might achieve this for me?

Someone types a string into a 'Full name' field, and I need to break it into: Title, Firstname, Middle, Surname, Suffix.

But the user can type "John Smith", so it needs to put John into First Name, and Smith into Surname. A person can type Mr John Smith (I have a list of known titles and suffixes), so if the first string is a title, it goes into Title field.

A perfect example would be:

Mr John Campbell Smith Jr

But, they could have:

Mr and Mrs John and Mary Smith

So the title would be Mr and Mrs, the first name would be John and Mary, and the surname would be Smith. (They can use either "and" or "&" as a joiner.)

I'm guessing this is too complex for regex, but I was hoping someone might have an idea?

Upvotes: 4

Views: 1609

Answers (1)

Mike Perrenoud

Reputation: 67898

Alright, here is a program that I think will do the job for you. You may of course have to make some modifications because I made some assumptions based on your question, but this should certainly get you started in the right direction.

Some of those assumptions are as follows:

  1. There is no punctuation in the name provided to the function (e.g. Jr. with the period).
  2. You must have a first and last name, but the titles, middle name, and suffix are optional.
  3. The only join operators are "and" and "&", just as stated in the question.
  4. The name is in this format {titles} {first name} {middle name} {last name} {suffix}.
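Assumption 1 can be satisfied before parsing by cleaning the input. The following is only a minimal sketch, not part of the original program: `NameInput` and `StripPunctuation` are hypothetical names, and it assumes that periods and commas at the edges of a token are the only punctuation worth removing.

```csharp
using System;
using System.Linq;

static class NameInput
{
    // Hypothetical helper: trims leading and trailing '.' and ',' from each
    // space-separated token, so "Jr." becomes "Jr" and "Smith," becomes "Smith".
    // Punctuation inside a token (e.g. "J.R") is left alone.
    public static string StripPunctuation(string fullName)
    {
        var tokens = fullName
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(t => t.Trim('.', ','))
            .Where(t => t.Length > 0);

        return string.Join(" ", tokens);
    }
}

static class StripPunctuationDemo
{
    static void Main()
    {
        // Prints: Mr John Smith Jr
        Console.WriteLine(NameInput.StripPunctuation("Mr. John  Smith, Jr."));
    }
}
```

The cleaned string can then be passed to the parsing function in place of the raw input.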

I threw a lot of different names at it, but there are certainly more possibilities. I spent no more than 30 minutes on this, so it is not fully tested.

using System;
using System.Collections.Generic;

class Program
{
    static List<string> _titles = new List<string> { "Mr", "Mrs", "Miss" };
    static List<string> _suffixes = new List<string> { "Jr", "Sr" };

    static void Main(string[] args)
    {
        var nameCombinations = new List<string>
        {
            "Mr and Mrs John and Mary Sue Smith Jr",
            "Mr and Mrs John and Mary Smith Jr",
            "Mr and Mrs John and Mary Sue Smith",
            "Mr and Mrs John and Mary Smith",
            "Mr and Mrs John Smith Jr",
            "Mr and Mrs John Smith",
            "John Smith",
            "John and Mary Smith",
            "John and Mary Smith Jr",
            "Mr John Campbell Smith Jr",
            "Mr John Smith",
            "Mr John Smith Jr",
        };

        foreach (var name in nameCombinations)
        {
            Console.WriteLine(name);

            var breakdown = InterperetName(name);

            Console.WriteLine("    Title(s):       {0}", string.Join(", ", breakdown.Item1));
            Console.WriteLine("    First Name(s):  {0}", string.Join(", ", breakdown.Item2));
            Console.WriteLine("    Middle Name:    {0}", breakdown.Item3);
            Console.WriteLine("    Last Name:      {0}", breakdown.Item4);
            Console.WriteLine("    Suffix:         {0}", breakdown.Item5);

            Console.WriteLine();
        }

        Console.ReadKey();
    }

    static Tuple<List<string>, List<string>, string, string, string> InterperetName(string name)
    {
        var segments = name.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);

        List<string> titles = new List<string>(),
            firstNames = new List<string>();
        string middleName = null, lastName = null, suffix = null;
        // Parser state: 0 = titles, 1 = first names, 2 = middle name, 3 = last name, 4 = suffix.
        int segment = 0;

        for (int i = 0; i < segments.Length; i++)
        {
            var s = segments[i];

            switch (segment)
            {
                case 0:
                    if (_titles.Contains(s))
                    {
                        titles.Add(s);
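                        // Check the bounds before peeking ahead, in case a title is the last token.
                        if (i + 1 < segments.Length && segments[i + 1].IsJoiner())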
                        {
                            i++;
                            continue;
                        }

                        segment++;
                    }
                    else
                    {
                        segment++;
                        goto case 1;
                    }

                    break;
                case 1:
                    firstNames.Add(s);
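                    // Check the bounds before peeking ahead, in case a first name is the last token.
                    if (i + 1 < segments.Length && segments[i + 1].IsJoiner())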
                    {
                        i++;
                        continue;
                    }

                    segment++;

                    break;
                case 2:
                    if ((i + 1) == segments.Length)
                    {
                        segment++;
                        goto case 3;
                    }
                    else if ((i + 2) == segments.Length && _suffixes.Contains(segments[i + 1]))
                    {
                        segment++;
                        goto case 3;
                    }

                    middleName = s;
                    segment++;

                    break;
                case 3:
                    lastName = s;
                    segment++;

                    break;
                case 4:
                    if (_suffixes.Contains(s))
                    {
                        suffix = s;
                    }

                    segment++;

                    break;
            }
        }

        return new Tuple<List<string>, List<string>, string, string, string>(titles, firstNames, middleName, lastName, suffix);
    }
}

internal static class Extensions
{
    internal static bool IsJoiner(this string s)
    {
        var val = s.ToLower().Trim();
        return val == "and" || val == "&";
    }
}
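
One thing to be aware of: `List<string>.Contains` compares case-sensitively, so input such as "mr john smith jr" would not have its title or suffix recognized. If that matters, one possible adjustment (a sketch only; `KnownWords` is a hypothetical name, and the entries simply mirror the `_titles` and `_suffixes` lists above) is to keep the known words in sets with a case-insensitive comparer:

```csharp
using System;
using System.Collections.Generic;

static class KnownWords
{
    // Same entries as _titles and _suffixes; only the comparer differs.
    public static readonly HashSet<string> Titles =
        new HashSet<string>(new[] { "Mr", "Mrs", "Miss" }, StringComparer.OrdinalIgnoreCase);

    public static readonly HashSet<string> Suffixes =
        new HashSet<string>(new[] { "Jr", "Sr" }, StringComparer.OrdinalIgnoreCase);
}

static class KnownWordsDemo
{
    static void Main()
    {
        Console.WriteLine(KnownWords.Titles.Contains("mr"));    // True
        Console.WriteLine(KnownWords.Suffixes.Contains("JR"));  // True
        Console.WriteLine(KnownWords.Titles.Contains("John"));  // False
    }
}
```

The `Contains` calls in the parser would stay the same; only the declarations of the two collections would change.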

Upvotes: 6
