Reputation: 13
I have a list of job titles (12,000 in total) formatted in this way:
Career_List = ['1) ABLE SEAMAN', '2) ABRASIVE GRADER', '3) ABRASIVE GRINDER']
How do I remove the numbers, parentheses, and spaces from the list elements so that I end up with this output:
Career_List_Updated = ['ABLE SEAMAN', 'ABRASIVE GRADER', 'ABRASIVE GRINDER']
I know that I am unable to simply remove the first three characters because I have more than ten items in my list.
Upvotes: 1
Views: 1782
Reputation: 77900
Split each career at the first space; keep the rest of the line.
Career_List = ['1) ABLE SEAMAN', '2) ABRASIVE GRADER', '3) ABRASIVE GRINDER', '12000) ZEBRA CLEANER']
Career_List_Updated = []
for career in Career_List:
job = career.split(' ', 1)
Career_List_Updated.append(job[1])
print Career_List_Updated
Output:
['ABLE SEAMAN', 'ABRASIVE GRADER', 'ABRASIVE GRINDER', 'ZEBRA CLEANER']
One-line version:
Career_List_Updated = [career.split(' ', 1)[1] \
for career in Career_List]
Upvotes: 1
Reputation: 597
Take advantage of the fact that str.lstrip()
and the rest of the strip
functions accept multiple characters as an argument.
Career_List_Updated =[career.lstrip('0123456789) ') for career in Career_List]
Upvotes: 2
Reputation: 498
We want to find the first index that STOPS being a bad character and return the rest of the string, as follows.
def strip_bad_starting_characters_from_string(string):
bad_chars = set(r"'0123456789 )") # set of characters we don't like
for i, char in enumerate(string):
if char not in bad_chars
# we are at first index past "noise" digits
return string[i:]
career_list_updated = [strip_bad_starting_characters_from_string(string) for string in career_list]
Upvotes: 0