wangke99

Reputation: 151

Converting user nickname to formal first name in Python

I am trying to map users across different systems based on their first and last names in Python.

One issue is that the first names are in many cases nicknames. For example, the same user's first name may be 'Dave' in one system and 'David' in another.

Is there an easy way in Python to convert common nicknames like these to their formal counterparts?

Thanks!

Upvotes: 4

Views: 4577

Answers (3)

goFrendiAsgard

Reputation: 4084

In [1]: first_name_dict = {'David': ['Dave']}
In [2]: def get_real_first_name(name):
   ...:     for first_name, nicknames in first_name_dict.items():
   ...:         # Match either the formal name itself or one of its nicknames
   ...:         if name == first_name or name in nicknames:
   ...:             return first_name
   ...:     # No entry matched: return the name unchanged
   ...:     return name
   ...:

In [3]: get_real_first_name('David')
Out[3]: 'David'

In [4]: get_real_first_name('Dave')
Out[4]: 'David'

I'm using IPython. Basically, you need a dictionary to do this; first_name_dict is your first-name dictionary. For example, if David can be called "Dave" or "Davy", and Lucas can be called "Luke", you can write the dictionary like this:

first_name_dict = {'David' : ['Dave', 'Davy'], 'Lucas' : ['Luke']}

You can improve the solution by making the matching case-insensitive.
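One possible way to do the case-insensitive matching is to invert the dictionary once, so that every lowercased nickname (and formal name) points to its formal name. This is only a sketch: build_lookup and LOOKUP are names made up for illustration, and it assumes each nickname belongs to a single formal name.

```python
first_name_dict = {'David': ['Dave', 'Davy'], 'Lucas': ['Luke']}

def build_lookup(name_dict):
    # Hypothetical helper: map every lowercased nickname and formal name
    # to the formal name, so each lookup is a single dictionary access.
    lookup = {}
    for formal, nicknames in name_dict.items():
        lookup[formal.lower()] = formal
        for nick in nicknames:
            lookup[nick.lower()] = formal
    return lookup

LOOKUP = build_lookup(first_name_dict)

def get_real_first_name(name):
    # Fall back to the input unchanged when the name is not in the dictionary.
    return LOOKUP.get(name.strip().lower(), name)
```

With this, get_real_first_name('dave') and get_real_first_name('DAVE') both return 'David', and a name that is not in the dictionary comes back as typed.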

Upvotes: 0

jdotjdot

Reputation: 17072

Not within Python specifically, but try using this:

http://deron.meranda.us/data/nicknames.txt

If you load that data into Python (csv.reader(<FileObject>, delimiter='\t')), you can then write a weighted-probability function that returns a full name for each nickname in that list.

You could do something like this:

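import collections
import random

def weighted_choice_sub(weights):
    # Return an index chosen at random, with probability proportional to its weight.
    # Source for this function:
    #  http://eli.thegreenplace.net/2010/01/22/weighted-random-generation-in-python/
    rnd = random.random() * sum(weights)
    for i, w in enumerate(weights):
        rnd -= w
        if rnd < 0:
            return i

def load_names(filename):
    # filename: path to a local copy of the tab-separated file linked above.
    # Returns a dict mapping each nickname to a list of (full name, weight) tuples.
    outdict = collections.defaultdict(list)
    with open(filename, 'r') as infile:
        for line in infile:
            tmp = line.strip().split('\t')
            outdict[tmp[0]].append((tmp[1], float(tmp[2])))
    return outdict


def full_name(nickname, names):
    # names: the dict returned by load_names(); load it once and reuse it.
    candidates = names.get(nickname)
    if not candidates:
        return nickname  # unknown nickname: return it unchanged
    return candidates[weighted_choice_sub([w for _, w in candidates])][0]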
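If you need the same answer every time (which you probably do when matching records), a variation is to pick the highest-weight full name instead of sampling. This is a different technique from the weighted random choice above, shown only as a sketch; most_likely_full_name is a made-up name and the sample weights are invented, not taken from the real file.

```python
def most_likely_full_name(nickname, names):
    # names: dict mapping nickname -> list of (full name, weight) tuples,
    # the same shape the loader above builds.
    candidates = names.get(nickname)
    if not candidates:
        return nickname  # unknown nickname: leave it unchanged
    # Pick the candidate with the largest weight instead of sampling.
    return max(candidates, key=lambda pair: pair[1])[0]

# Made-up weights, for illustration only (not taken from the real file).
sample = {'Dave': [('David', 0.9), ('Davis', 0.1)]}
print(most_likely_full_name('Dave', sample))  # prints David
```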

Upvotes: 5

RonaldBarzell

Reputation: 3830

You'd have to create a database or hash map that maps nicknames onto formal names. If you can find such a list online, implementing the map will be trivial. The real fun will be getting a complete enough list, ensuring variations are taken care of, and making sure you don't run into problems when people's formal names ARE their nicknames. Not everyone who goes by Dave has a formal name of David, for example. The person's formal name may very well be Dave.
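One way to cope with the "formal name is the nickname" case is to avoid converting at all and instead ask whether two first names could refer to the same person. The sketch below treats every name as a candidate for itself plus any formal names it is a known nickname of. NICKNAMES is sample data and both function names are made up for illustration.

```python
NICKNAMES = {'David': ['Dave', 'Davy'], 'Lucas': ['Luke']}  # sample data only

def possible_formal_names(first_name, name_dict=NICKNAMES):
    # The name itself is always a candidate, since 'Dave' may be the formal name.
    lowered = first_name.lower()
    candidates = {lowered}
    for formal, nicknames in name_dict.items():
        if lowered in (n.lower() for n in nicknames):
            candidates.add(formal.lower())
    return candidates

def first_names_may_match(a, b):
    # Two first names are compatible if their candidate sets overlap.
    return bool(possible_formal_names(a) & possible_formal_names(b))
```

Here first_names_may_match('Dave', 'David') and first_names_may_match('Dave', 'Dave') are both True, while first_names_may_match('Dave', 'Luke') is False. You would still compare last names (or other fields) before treating two records as the same user.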

Upvotes: 0
