Max Holloway
Max Holloway

Reputation: 34

Levenshtein excel for huge set of data

I'm trying to use a Levenstein algo I found here to clean a huge amount of data. But having trouble implementing it.

I have 100,000 rows of excel data.

One of the columns contains a city name, these have multiple typos (hence levenstein)

I have a list of 1,000,000 cities around the world. I am trying to find the best way to query both sets of data and return the correct spelling of the city for each instance.

Currently I have =IF(Levenshtein(J5,$K$4)<4, $K$4, "No Match")

But this would need to repeated for each of my one million cities individually.

My ideal solution is to add another column, insert a formula and end up with the correct spelling of that cities name.

Upvotes: 0

Views: 727

Answers (1)

Tim
Tim

Reputation: 766

This 'Fuzzy Lookup Add-In for Excel' should do the trick :)

http://www.microsoft.com/en-gb/download/details.aspx?id=15011

Upvotes: 0

Related Questions