Shabazz

Reputation: 1

How to rename multiple spellings of a categorical variable as one value

A number of values in a column have similar names that are spelled differently.

How do I combine the different spellings as one categorical value?

array(['Individual', 'Trust', 'LLC', nan, 'individual', 'Partnership', 'INdividual', 'Corporation', 'Individual ', 'Corporation ', 'Trust '], dtype=object)

I want to combine all spellings of Individual, Corporation, and Trust. Then I want to combine all individuals and non-trusts into one new dummy variable.

I found the question 'Rename Misspelled Categorical values in Python', but the code didn't seem applicable. I also looked up lambda functions.

Upvotes: 0

Views: 17

Answers (1)

Shabazz

Reputation: 1

#replace each variant spelling with one canonical value (replace matches whole cell values here, not substrings)

#replace different spellings of individual
df['type'] = df['type'].replace('individual', 'Individual')
df['type'] = df['type'].replace('INdividual', 'Individual')
df['type'] = df['type'].replace('Individual ', 'Individual')

#replace different spellings of trust
df['type'] = df['type'].replace('Trust ', 'Trust')

#replace spellings of corporation
df['type'] = df['type'].replace('Corporation ', 'Corporation')
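The one-by-one replacements above work, but every new variant needs another line. A possibly shorter route is to normalize whitespace and case first, then map back to display labels. This is only a sketch: the sample data is copied from the array in the question, the `labels` mapping and the `non_trust` column name are my own assumptions, and the dummy is defined here as "any non-missing value other than Trust", which is one reading of the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the array in the question; the column name
# 'type' follows the code above.
df = pd.DataFrame({"type": ["Individual", "Trust", "LLC", np.nan,
                            "individual", "Partnership", "INdividual",
                            "Corporation", "Individual ", "Corporation ",
                            "Trust "]})

# Strip surrounding whitespace and lowercase, so that 'INdividual' and
# 'Individual ' both become 'individual'. NaN stays NaN.
key = df["type"].str.strip().str.lower()

# Map the normalized keys back to display labels (assumed mapping).
# Any key missing from this dict would become NaN.
labels = {"individual": "Individual", "trust": "Trust", "llc": "LLC",
          "partnership": "Partnership", "corporation": "Corporation"}
df["type"] = key.map(labels)

# Hypothetical dummy: 1 for every non-missing value that is not 'Trust'.
df["non_trust"] = (df["type"].notna() & (df["type"] != "Trust")).astype(int)

print(df["type"].value_counts(dropna=False))
print(df["non_trust"].tolist())
```

If the dummy should cover a different set of categories, swapping the comparison for something like `df["type"].isin([...])` with the wanted labels should do it.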

Upvotes: 0
