Comparing names in different formats using Python

Question

I want to compare names that appear in different formats, e.g. "George W. Bush", "George Bush", "George Walker Bush", "Bush, George Walker", "Bush, GW", "Bush, George", etc. A few contain dots (".") as well, but I omitted those from the list because I will normalize them anyway. In fact, the commas (",") will be stripped as well.

What is the best and most efficient approach to determine whether any two given names actually represent the same person? I have thought of using nameparser and building a comparison algorithm on top of it, but please suggest any other possible options. An approach using only Python's standard modules would be fine too.

Neo · Accepted Answer

There's an open-source library that may be useful, or that can at least serve as a base to build more functionality on:

https://github.com/rliebz/whoswho

Sample usage:

>>> from whoswho import who
>>> who.match('Bush, G.W.', 'George W. Bush')
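
If a standard-library-only approach is preferred, a rough sketch might look like the following. This is not how whoswho or nameparser works; the function names (parse_name, tokens_compatible, same_person) are made up for illustration. It assumes the comma is still present when parsing (it is what marks the "Last, First" order, so strip it only afterwards), that the last word of a comma-free name is the surname, and that a short all-uppercase token such as "GW" is a run of initials. It does not handle nicknames, suffixes such as "Jr.", or multi-word surnames.

```python
def parse_name(raw):
    """Return (surname, given_tokens), all lowercase."""
    if "," in raw:
        # "Last, First Middle" format: the comma marks the surname.
        surname, _, given = raw.partition(",")
    else:
        # "First Middle Last" format: assume the last word is the surname.
        given, _, surname = raw.strip().rpartition(" ")

    tokens = []
    for tok in given.replace(".", " ").split():
        if tok.isupper() and len(tok) <= 3:
            # Assumption: a short all-caps token is fused initials ("GW").
            tokens.extend(tok.lower())
        else:
            tokens.append(tok.lower())
    return surname.strip().lower(), tokens


def tokens_compatible(a, b):
    """An initial matches any name starting with it; full names must be equal."""
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    return a == b


def same_person(name1, name2):
    """Heuristic: same surname, and given names agree position by position."""
    last1, given1 = parse_name(name1)
    last2, given2 = parse_name(name2)
    if last1 != last2:
        return False
    # zip stops at the shorter list, so a missing middle name is tolerated.
    return all(tokens_compatible(a, b) for a, b in zip(given1, given2))


if __name__ == "__main__":
    print(same_person("George W. Bush", "Bush, GW"))
    print(same_person("George Bush", "Jeb Bush"))
```

Because the comparison is positional, "George W. Bush" and "George H. W. Bush" are treated as different people, which may or may not be what you want; a fuzzier score (for example with difflib.SequenceMatcher) could replace the strict equality check.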
