Reputation: 1
I am currently working on building a regressor model to predict the food delivery time.
This is the dataframe with a few observations.
As you can see, the Cuisines column holds several comma-separated strings per row. I used this code:
pd.get_dummies(data.Cuisines.str.split(',',expand=True),prefix='c')
This helped me split the strings and one-hot encode them; however, it introduced a new issue.
I merged the dataframe with the dummies. Fast Food appears in the 1st and 3rd rows, so I expected a single fastfood column with value 1 in those rows. Instead, two fastfood columns are created: one (the 4th column) for the first row and another (the 15th column) for the third row.
Can someone help me get a single fastfood column with value 1 in the first and third rows, and likewise for the other cuisines?
Upvotes: 0
Views: 64
Reputation: 150725
The two Fast Food values differ by a stray space: splitting on ',' alone keeps the space that follows each comma. You probably want to try:
data.Cuisines.str.replace(r'\s*,\s*', ',', regex=True).str.get_dummies(',')
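For reference, here is a minimal sketch of the whole flow on made-up data. The sample rows and the names `cleaned`, `dummies`, and `result` are assumptions for illustration, not the asker's real frame. It normalizes the separators first, since `str.get_dummies` may treat `sep` as a literal string rather than a regex, and then joins the indicator columns back onto the original frame.

```python
import pandas as pd

# Hypothetical sample data; the real frame has more rows and columns.
data = pd.DataFrame({
    "Cuisines": [
        "Fast Food, Rolls",
        "Ice Cream, Desserts",
        "Italian, Street Food, Fast Food",
    ]
})

# Normalize every separator to a bare comma so no item keeps a stray space.
cleaned = data["Cuisines"].str.strip().str.replace(r"\s*,\s*", ",", regex=True)

# One indicator column per distinct cuisine, regardless of its position in the string.
dummies = cleaned.str.get_dummies(sep=",").add_prefix("c_")

# The index is unchanged, so join aligns the indicators with the original rows.
result = data.join(dummies)
print(result.filter(like="c_"))
```

With this sample there should be exactly one `c_Fast Food` column, set in the first and third rows.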
Upvotes: 1