Reputation: 3977
The code below compares user input against the names of US cities in a CSV file. The file is comma-delimited, with a single column and a header row. If the user input matches a string in the file, the string from the file is returned; if there is no match, the user input is returned.
In addition to returning data based on the exact match, how can I also return data from the file or the user input based on the number of matched characters?
Example:
User input: Brookly
string in file: Brooklyn
Output: Brooklyn
In the example above, the two strings differ by only one character. So the rule could be: if the total character difference is one, return the string from the file; otherwise, return the user input.
The RemoveAllFormat method in the code simply strips all formatting from both strings before they are compared.
Code:
    // Requires: using System; using System.IO;
    public string MatchedCity(string input)
    {
        string cityMatch = null;
        const string lookupFile = @"X:\city.csv";

        using (StreamReader r = new StreamReader(lookupFile))
        {
            string line;
            // Read the file line by line until a match is found or the file ends.
            while (string.IsNullOrEmpty(cityMatch) && (line = r.ReadLine()) != null)
            {
                foreach (string city in line.Split(','))
                {
                    // Compare with formatting stripped from both strings.
                    if (String.Equals(RemoveAllFormat(input), RemoveAllFormat(city)))
                    {
                        cityMatch = city;
                        break;
                    }
                }
            }
        }

        // Fall back to the user input when no city matched.
        return string.IsNullOrEmpty(cityMatch) ? input : cityMatch.Replace("\"", "");
    }
Upvotes: 2
Views: 818
Reputation: 75585
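You can compute the Levenshtein distance (the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into another) using this code someone kindly posted here. It looks like there is another implementation here, under a more obvious license.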
You can then decide how much distance you are willing to tolerate for "close enough", and output rows for which the distance is sufficiently small for your taste.
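To make the idea concrete, here is a minimal sketch of how that could look. It is not the code from either linked implementation: the class name FuzzyCity, the method ClosestCity, and the default tolerance of 1 are all hypothetical, and ToLowerInvariant is only a stand-in for the asker's RemoveAllFormat, whose body is not shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; the names here are illustrative, not from the question.
public static class FuzzyCity
{
    // Standard dynamic-programming Levenshtein distance, keeping only two rows.
    public static int Levenshtein(string a, string b)
    {
        if (a.Length == 0) return b.Length;
        if (b.Length == 0) return a.Length;

        // previous[j] = distance between the first i-1 chars of a and the first j chars of b.
        int[] previous = new int[b.Length + 1];
        int[] current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1,   // insertion
                             previous[j] + 1),     // deletion
                    previous[j - 1] + cost);       // substitution (or match)
            }
            // Swap rows so that previous always holds the last completed row.
            int[] tmp = previous; previous = current; current = tmp;
        }
        return previous[b.Length];
    }

    // Returns the closest city if it is within maxDistance edits, otherwise the input.
    // Assumption: lower-casing stands in for the asker's RemoveAllFormat.
    public static string ClosestCity(string input, IEnumerable<string> cities, int maxDistance = 1)
    {
        string best = null;
        int bestDistance = int.MaxValue;
        foreach (string city in cities)
        {
            int d = Levenshtein(input.ToLowerInvariant(), city.ToLowerInvariant());
            if (d < bestDistance)
            {
                bestDistance = d;
                best = city;
            }
        }
        return bestDistance <= maxDistance ? best : input;
    }
}
```

With this sketch, FuzzyCity.ClosestCity("Brookly", new[] { "Boston", "Brooklyn" }) would pick "Brooklyn", since the distance is 1. In the original MatchedCity method, the same effect could be had by replacing the String.Equals check with a test that the distance between the two stripped strings is at most your chosen tolerance. Scanning the whole list for the smallest distance, rather than stopping at the first hit, avoids returning a merely tolerable match when an exact one appears later in the file.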
Upvotes: 4