How to one-hot encode a dataframe column with multiple strings?

Question

I am currently building a regression model to predict food delivery time.

This is the dataframe with a few observations.

As you can see, the Cuisines column holds several comma-separated strings in each row. I used this code:

pd.get_dummies(data.Cuisines.str.split(',',expand=True),prefix='c')

This split the strings and one-hot encoded them, but it introduced a new issue.

I merged the dataframe with the dummies. fastfood appears in the 1st and 3rd rows. I expected a single fastfood column with value 1 in the first and third rows, but two fastfood columns were created instead: one (the 4th column) for the first row and another (the 15th column) for the third row.

Can someone help me get a single fastfood column with value 1 in the first and third rows, and likewise one column for each of the other cuisines?

Answers (1)
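
The duplicate columns most likely come from `str.split(',', expand=True)` putting each position of the list into its own column, so `pd.get_dummies` encodes "fastfood in position 0" and "fastfood in position 2" separately. One way around this is `Series.str.get_dummies`, which splits on a separator and emits one column per distinct value. Below is a minimal sketch; the sample rows are made up (the real dataframe is not shown), and it assumes the cuisines are separated by commas with optional spaces around them.

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataframe.
data = pd.DataFrame({
    "Cuisines": [
        "fastfood, pizza",
        "indian",
        "burger, fastfood",
        "indian, chinese",
    ],
})

# Normalise the separator so " fastfood" and "fastfood" are the same label,
# then let str.get_dummies build one indicator column per distinct cuisine.
dummies = (
    data["Cuisines"]
    .str.strip()
    .str.replace(r"\s*,\s*", ",", regex=True)
    .str.get_dummies(sep=",")
    .add_prefix("c_")
)

# Attach the indicator columns to the original dataframe.
result = pd.concat([data, dummies], axis=1)
print(result)
```

With this approach there is a single `c_fastfood` column that is 1 wherever fastfood appears, regardless of its position in the string. If you prefer to stay with `pd.get_dummies`, splitting into lists, calling `explode()`, encoding, and then aggregating with `groupby(level=0).max()` should give an equivalent result.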