Pachino

Reputation: 11

How to deal with different ways of writing the same thing

I'd like to know whether Django has a module for this problem. My PostgreSQL database contains several spellings of the same city name, because the data was scraped from different websites. For example, the "city name" field could be "S. Diego" or "San Diego". Is there a module that would always normalize both to "San Diego", and that would let me add a new rule whenever a new variant such as "S Diego" appears, so that I can maintain this workflow over time?

Thanks

Upvotes: 0

Views: 36

Answers (1)

Deniz Kaplan

Reputation: 1609

You can use an external API to normalize the data you have scraped. Yandex and Google both offer a feature that returns a list of possible location names for a search query. Take the most likely result they return and use it to map your input to the correct name. Manual mapping is also an option, but I highly recommend relying on one of the giants that have already solved this problem.
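
If you do go the manual mapping route, a minimal sketch in plain Python might look like the following. The names `CITY_ALIASES`, `normalize_city` and `add_alias` are made up for illustration and are not part of Django or any library. In a real Django project the alias table would more likely live in a model than in a module-level dict, but the idea is the same: clean the raw string into a lookup key, then map that key to one canonical name.

```python
import re

# Hypothetical alias table: cleaned-up variant -> canonical city name.
CITY_ALIASES = {
    "s diego": "San Diego",
    "san diego": "San Diego",
}


def _key(raw: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    return " ".join(cleaned.split())


def normalize_city(raw: str) -> str:
    """Return the canonical name, or the trimmed input if no alias is known."""
    return CITY_ALIASES.get(_key(raw), raw.strip())


def add_alias(variant: str, canonical: str) -> None:
    """Register a newly seen spelling so later lookups resolve it."""
    CITY_ALIASES[_key(variant)] = canonical
```

Because punctuation and case are stripped before the lookup, "S. Diego" and "S Diego" share one entry. A spelling that is not in the table comes back unchanged, so you can log it, review it, and then call `add_alias` to cover it from then on.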

Upvotes: 1
