hsandhar

Reputation: 19

How to deal with text input containing a last name with a space or a first name [space] last name combination

I'm dealing with a problem that I can't wrap my head around and could use your help and expertise.

I have a textbox that allows the user to search for another user by a combination of name criteria listed below:

Issue: Quite a few users have a space in their last name. Someone searching for them may enter only "de la".

In this scenario, because there is a space between the words, the system assumes the search criteria are a first name starting with "de" and a last name starting with "la". The system works as expected if the user types "de la," because the comma tells it that the search is for a last name, but I have to assume that not everyone will enter a comma at the end.

However, the user probably intended to search only for someone whose last name starts with "de la".

Current options: I have a few options in mind and would appreciate your help deciding which one you would recommend. And PLEASE, feel free to add your own suggestions.

I'd appreciate any feedback you can offer on this issue: your experiences, best practices, or, best of all, code snippets from something you've worked on related to this scenario.

Application background: It's an ASP.NET (4.0) Web API service written in C#; it's consumed by a client sitting on a different server.

Upvotes: 0

Views: 2408

Answers (2)

ManOVision

Reputation: 1893

I've used this technique for a number of years and I like it.

Lose the comma; no one will use it. If there is no space, search for first OR last. If there is a space, search for first AND last. This code works very well for partial name searches, e.g. "J S" finds Jane Smith and John Smith, and "John" finds "John Smith" and "Anne Johnson". This should give you a pretty good starting point to get as fancy as you want with your supported queries.

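using System;
using System.Collections.Generic;
using System.Linq;

// PeopleRepository is assumed to be an in-memory IEnumerable<People>; a LINQ-to-database
// provider may not translate StartsWith with a StringComparison argument.
public IEnumerable<People> Search(string query, int maxResults = 20)
{
    // Nothing to search for: return an empty result instead of every user.
    if (string.IsNullOrWhiteSpace(query))
    {
        return Enumerable.Empty<People>();
    }

    IEnumerable<People> results;

    // Split on spaces, ignoring repeated, leading, and trailing spaces.
    var split = query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    if (split.Length > 1)
    {
        // Several words: the first is a first-name prefix, the rest form a last-name prefix.
        var firstName = split[0];
        var lastName = string.Join(" ", split.Skip(1));

        results = PeopleRepository.Where(x =>
            x.FirstName.StartsWith(firstName, StringComparison.OrdinalIgnoreCase) &&
            x.LastName.StartsWith(lastName, StringComparison.OrdinalIgnoreCase));
    }
    else
    {
        // One word: match it as a prefix of either the first or the last name.
        var search = split[0];
        results = PeopleRepository.Where(x =>
            x.FirstName.StartsWith(search, StringComparison.OrdinalIgnoreCase) ||
            x.LastName.StartsWith(search, StringComparison.OrdinalIgnoreCase));
    }

    return results.Take(maxResults);
}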
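
One possible extension for the "de la" case from the question: in addition to the first AND last rule, also try the whole query as a prefix of either name. This is only a sketch under assumptions. The `Person` class, the `NameSearch` helper, and passing the people in as a parameter are hypothetical stand-ins, not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal type for illustration only.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class NameSearch
{
    public static IEnumerable<Person> Search(
        IEnumerable<Person> people, string query, int maxResults = 20)
    {
        if (string.IsNullOrWhiteSpace(query))
        {
            return Enumerable.Empty<Person>();
        }

        var parts = query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var whole = string.Join(" ", parts);          // normalized query, e.g. "de la"
        var first = parts[0];
        var rest = string.Join(" ", parts.Skip(1));   // empty when there is a single word

        return people.Where(p =>
                // Whole query as a prefix of either name ("de la" matches "de la Cruz").
                p.LastName.StartsWith(whole, StringComparison.OrdinalIgnoreCase) ||
                p.FirstName.StartsWith(whole, StringComparison.OrdinalIgnoreCase) ||
                // Otherwise: first word is a first-name prefix, the rest a last-name prefix.
                (parts.Length > 1 &&
                 p.FirstName.StartsWith(first, StringComparison.OrdinalIgnoreCase) &&
                 p.LastName.StartsWith(rest, StringComparison.OrdinalIgnoreCase)))
            .Take(maxResults);
    }
}
```

With this variant, "de la" returns both "Maria de la Cruz" (last-name prefix) and "Dean Lambert" (first AND last), so the user sees the likely match without needing a comma. The trade-off is a slightly broader result list.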

Upvotes: 1

Matías Fidemraizer

Reputation: 64943

Maybe the point is that you should index your user data so you can search it efficiently.

For example, you could index first and last names without distinguishing between them. You want to search for people, so why should the end user care about the order of the search terms?

The index can store user ids in sets keyed by name (either first or last name). If user ids are integers, it would look something like this:

John => 12, 19, 1929, 349, 1, 29
Smith => 12, 349, 11, 4
Matias => 931, 45
Fidemraizer => 931

This way the user can type whatever they want and you no longer care about ordering: if the user types "John", you show all users whose ids are in the John set. If they type "John Smith", you intersect the John and Smith sets to find the user ids that appear in both, and so on.
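
A minimal in-memory sketch of this idea, assuming integer user ids and exact, case-insensitive matching on whole name words. The `NameIndex` class and its method names are made up for illustration; a real index would live in your database or search store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class NameIndex
{
    // Name word (case-insensitive) -> ids of users with that word in their first or last name.
    private readonly Dictionary<string, HashSet<int>> _index =
        new Dictionary<string, HashSet<int>>(StringComparer.OrdinalIgnoreCase);

    public void Add(int userId, string firstName, string lastName)
    {
        // Index every word, so a last name like "de la Cruz" contributes three entries.
        var tokens = (firstName + " " + lastName)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var token in tokens)
        {
            HashSet<int> ids;
            if (!_index.TryGetValue(token, out ids))
            {
                ids = new HashSet<int>();
                _index[token] = ids;
            }
            ids.Add(userId);
        }
    }

    public ISet<int> Search(string query)
    {
        HashSet<int> result = null;
        var tokens = (query ?? string.Empty)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var token in tokens)
        {
            HashSet<int> ids;
            if (!_index.TryGetValue(token, out ids))
            {
                // An unknown word means no user can match every word.
                return new HashSet<int>();
            }
            if (result == null)
            {
                result = new HashSet<int>(ids);   // copy so the index is not mutated
            }
            else
            {
                result.IntersectWith(ids);        // keep only ids present in every set
            }
        }
        return result ?? new HashSet<int>();
    }
}
```

After `Add(12, "John", "Smith")` and `Add(19, "John", "Doe")`, both `Search("John Smith")` and `Search("smith john")` return only id 12, since word order no longer matters. Note that this sketch matches whole words only; prefix search ("Jo") would need an extra structure such as a sorted key list or a trie.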

I don't know what database technology you're currently using. Both SQL and NoSQL products can store this kind of index, though I'd expect a NoSQL store to work better.

Upvotes: 0
