Reputation: 34
I'm trying to use a Levenstein algo I found here to clean a huge amount of data. But having trouble implementing it.
I have 100,000 rows of excel data.
One of the columns contains a city name, these have multiple typos (hence levenstein)
I have a list of 1,000,000 cities around the world. I am trying to find the best way to query both sets of data and return the correct spelling of the city for each instance.
Currently I have =IF(Levenshtein(J5,$K$4)<4, $K$4, "No Match")
But this would need to repeated for each of my one million cities individually.
My ideal solution is to add another column, insert a formula and end up with the correct spelling of that cities name.
Upvotes: 0
Views: 727
Reputation: 766
This 'Fuzzy Lookup Add-In for Excel' should do the trick :)
http://www.microsoft.com/en-gb/download/details.aspx?id=15011
Upvotes: 0