Reputation: 1186
I was querying Stack Overflow to get some data (https://data.stackexchange.com/stackoverflow/query/new), and I have a data frame with Tags as a column. The tags were originally of the form
<html><css>
I managed to get them in the form of
html,css
I think an image of my Jupyter notebook can display it best:
How can I separate the tags so that they become categorical variables that I can transform using something like get_dummies? Everything I've seen refers to actual lists, like [html,css], rather than just comma-separated words.
Upvotes: 0
Views: 203
Reputation: 19885
For this purpose, we can use df['Tags'].str.get_dummies(','), which splits each string on the comma and converts each resulting tag into its own one-hot encoded column.
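As a minimal sketch of how this might look in practice: the data frame below is made up for illustration (the three sample rows are assumptions, not the asker's actual data), and only the Tags column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the "Tags" column name comes from the question.
df = pd.DataFrame({"Tags": ["html,css", "python,pandas", "css"]})

# Split each string on "," and build one 0/1 indicator column per distinct tag.
dummies = df["Tags"].str.get_dummies(sep=",")

# Optionally attach the indicator columns to the original frame.
result = df.join(dummies)
print(result)
```

Each distinct tag becomes its own column, with 1 where the row contains that tag and 0 otherwise. Note that this is the string accessor method Series.str.get_dummies, which takes a separator, rather than the top-level pd.get_dummies, which would treat the whole string "html,css" as a single category.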
Upvotes: 1