Reputation: 1
A number of values in a column have similar but differently spelled names.
How do I combine the different spellings into one categorical value?
array(['Individual', 'Trust', 'LLC', nan, 'individual', 'Partnership', 'INdividual', 'Corporation', 'Individual ', 'Corporation ', 'Trust '], dtype=object)
I want to combine all spellings of individual, corporation, and trust. Then I want to combine all individuals and non-trusts into one new dummy variable.
I found the question 'Rename Misspelled Categorical values in Python', but the code there didn't seem applicable. I also looked up lambda functions.
Upvotes: 0
Views: 17
Reputation: 1
# Series.replace matches whole cell values here, not substrings,
# so each variant must be listed exactly (including trailing spaces)
df['type'] = df['type'].replace({
    # different spellings of individual
    'individual': 'Individual',
    'INdividual': 'Individual',
    'Individual ': 'Individual',
    # different spellings of trust
    'Trust ': 'Trust',
    # different spellings of corporation
    'Corporation ': 'Corporation',
})
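Listing every variant by hand works, but it misses any new spelling that shows up later. A more general approach is to strip whitespace and compare case-insensitively. The sketch below rebuilds the column from the values shown in the question. The names `type_clean`, `canonical`, and `is_individual` are made up for illustration, and the dummy definition (1 for Individual, 0 for everything else, including missing values) is only one possible reading of what the question asks for.

```python
import numpy as np
import pandas as pd

# Sample data copied from the unique values shown in the question
df = pd.DataFrame({"type": [
    "Individual", "Trust", "LLC", np.nan, "individual", "Partnership",
    "INdividual", "Corporation", "Individual ", "Corporation ", "Trust ",
]})

# Remove leading/trailing whitespace; NaN stays NaN
stripped = df["type"].str.strip()

# Map lower-cased values to one canonical spelling (hypothetical mapping)
canonical = {
    "individual": "Individual",
    "trust": "Trust",
    "corporation": "Corporation",
}

# Values not in the mapping (e.g. 'LLC', 'Partnership') keep their stripped form
df["type_clean"] = stripped.str.lower().map(canonical).fillna(stripped)

# Assumed dummy: 1 for Individual, 0 otherwise (missing values also become 0)
df["is_individual"] = (df["type_clean"] == "Individual").astype(int)

print(df["type_clean"].unique())
```

If the dummy should instead flag every non-trust, something like `(df["type_clean"] != "Trust").astype(int)` would do it, though note that missing values would then be counted as 1.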
Upvotes: 0