Reputation: 823
I have a DataFrame of skill names, and I want to generate unique initials for each skill:
Skill_name              Initials
Risk Management         RM
Scope Management        SM
Stakeholder Management  StM
I tried regular expressions, but they give me SM for both Scope Management and Stakeholder Management. Any ideas?
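To illustrate the collision: the original pattern is not shown, so the sketch below assumes the regex simply took the first letter of every word. The names `skills` and `naive_initials` are made up for this example.

```python
import re

skills = ["Risk Management", "Scope Management", "Stakeholder Management"]

def naive_initials(name):
    # Assumed approach: \b\w matches the first character of each word.
    return "".join(re.findall(r"\b\w", name))

naive = [naive_initials(s) for s in skills]
print(naive)  # ['RM', 'SM', 'SM']
```

Both "Scope Management" and "Stakeholder Management" collapse to SM, because a per-row pattern has no knowledge of initials already handed out to earlier rows.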
Upvotes: 1
Views: 112
Reputation: 823
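I found another solution:
# Initials that have already been handed out.
unique = list()

def unique_initials(full_name):
    name_list = full_name.split()
    # First attempt: the first letter of every word.
    initials = ''
    for name in name_list:
        initials += name[0]
    if initials not in unique:
        unique.append(initials)
        return initials
    else:
        # Clash: take two letters from the first word instead.
        initials = ''
        i = 0
        for name in name_list:
            if i == 0:
                initials += name[:2]
                i += 1
            else:
                initials += name[0]
        # Note: this fallback is neither checked against nor added to
        # `unique`, so a further clash can still produce a duplicate.
        return initials

Skills['Initials'] = Skills['Name'].apply(unique_initials)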
Upvotes: 0
Reputation: 2041
I would suggest iterating over the names, as in the snippet below, and recording every set of initials already assigned in a set:
all_names = [
    'Risk Management',
    'Scope Management',
    'Stakeholder Management',
]
seen = set()

def find_initials(name, seen):
    # Assumes each name consists of exactly two words.
    first, last = name.split()
    # Try ever longer prefixes of both words until one is unused.
    for i in range(1, len(last) + 1):
        for j in range(1, len(first) + 1):
            initials = first[:j] + last[:i]
            if initials not in seen:
                seen.add(initials)
                return initials
    # full name is found in seen!
    for i in range(100):
        initials = f'{first}{last}{i}'
        if initials not in seen:
            seen.add(initials)
            return initials

initials = [find_initials(name, seen) for name in all_names]
print(initials)  # ['RM', 'SM', 'StM']
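If some skill names have one word or more than two, the two-word unpacking above raises a `ValueError`. A possible variation, only a sketch under my own assumptions, lengthens the prefix of the first word and keeps one letter from each remaining word. The function name `initials_for` and the extra sample names are hypothetical, not from the question.

```python
def initials_for(name, seen):
    """Return initials for `name` that are not yet in `seen`, and record them."""
    words = name.split()
    if not words:
        raise ValueError("name must contain at least one word")
    first, rest = words[0], words[1:]
    # One letter from every word after the first.
    tail = "".join(w[0] for w in rest)
    # Lengthen the prefix of the first word until the result is unused.
    for j in range(1, len(first) + 1):
        candidate = first[:j] + tail
        if candidate not in seen:
            seen.add(candidate)
            return candidate
    # Every prefix is taken: fall back to a numeric suffix.
    n = 2
    while f"{first}{tail}{n}" in seen:
        n += 1
    candidate = f"{first}{tail}{n}"
    seen.add(candidate)
    return candidate

seen = set()
names = [
    "Risk Management",
    "Scope Management",
    "Stakeholder Management",
    "Supply Chain Management",  # hypothetical three-word name
    "Testing",                  # hypothetical one-word name
]
result = {name: initials_for(name, seen) for name in names}
print(result)
```

Because the result depends on the order in which names are processed, sort the names first if the initials need to be reproducible across runs.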
Upvotes: 1