psowa001

Reputation: 823

How to extract unique initials

I have a dataframe with skill names, and I want to extract unique initials for these skills:

Skill_name             Initials
Risk Management        RM
Scope Management       SM
Stakeholder Management StM

I tried regular expressions, but they give me SM for both Scope Management and Stakeholder Management. Any ideas?

Upvotes: 1

Views: 112

Answers (2)

psowa001

Reputation: 823

I found another solution:

unique = []  # initials handed out so far

def unique_initials(full_name):
    name_list = full_name.split()
    # First attempt: first letter of every word, e.g. 'SM'
    initials = ''.join(name[0] for name in name_list)

    if initials in unique:
        # Collision: take two letters from the first word, e.g. 'StM'.
        # This fallback is not checked for further collisions.
        initials = name_list[0][:2] + ''.join(name[0] for name in name_list[1:])

    # Record the result, including the fallback, so later names can see it
    unique.append(initials)
    return initials

Skills['Initials'] = Skills['Name'].apply(unique_initials)

Upvotes: 0

Amir

Reputation: 2041

I would suggest iterating through the names, as in the snippet below, and keeping every initial that has already been assigned in a set:

all_names = [
    'Risk Management',
    'Scope Management',
    'Stakeholder Management',
]
seen = set()
def find_initials(name, seen):
    first, last = name.split()  # assumes the name has exactly two words
    for i in range(1, len(last)+1):
        for j in range(1, len(first) + 1):
            initials = first[:j] + last[:i]
            if initials not in seen:
                seen.add(initials)
                return initials
    # every prefix combination is already in seen; fall back to the full name plus a counter
    for i in range(100):
        initials = f'{first}{last}{i}'
        if initials not in seen:
            seen.add(initials)
            return initials

initials = [find_initials(name, seen) for name in all_names]
print(initials) # ['RM', 'SM', 'StM']
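
Note that first, last = name.split() raises a ValueError for names that do not have exactly two words. If that matters for your data, here is a minimal sketch of a variant that accepts any number of words. The function name make_initials and the extra 'Security' entry are my own, only for illustration, and it assumes every name is a non-empty string. It lengthens only the prefix taken from the first word, which is a simplification of the search above.

```python
def make_initials(names):
    """Return one unique abbreviation per name, in input order."""
    seen = set()
    result = []
    for name in names:
        words = name.split()
        head = words[0]
        # First letters of all remaining words (empty for one-word names).
        rest = "".join(word[0] for word in words[1:])
        # Grow the prefix taken from the first word until the result is unused.
        for j in range(1, len(head) + 1):
            candidate = head[:j] + rest
            if candidate not in seen:
                break
        else:
            # Every prefix is taken: append a counter as a last resort.
            n = 2
            while f"{head}{rest}{n}" in seen:
                n += 1
            candidate = f"{head}{rest}{n}"
        seen.add(candidate)
        result.append(candidate)
    return result


skills = ["Risk Management", "Scope Management", "Stakeholder Management", "Security"]
print(make_initials(skills))  # ['RM', 'SM', 'StM', 'S']
```

Because the function returns a list in the same order as its input, the result can be assigned directly to a dataframe column built from the same names.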

Upvotes: 1
