Reputation: 981
I have the following dataframe:
import pandas as pd
df = pd.DataFrame({'category': [1, 2, 1], 'names': ['ab c', 's', 'dm ab aaa']})
   category      names
0         1       ab c
1         2          s
2         1  dm ab aaa
I need to find all unique tokens (separated by spaces) in the names column, assign each token the category of its row, and create a new dataframe like the one below:
pd.DataFrame({'category': [1, 1, 2, 1, 1, 1], 'names': ['ab', 'c', 's', 'dm', 'ab', 'aaa']})
   category names
0         1    ab
1         1     c
2         2     s
3         1    dm
4         1    ab
5         1   aaa
What is the best way to do this?
Upvotes: 1
Views: 103
Reputation: 214957
You can split the names column first and then explode it:
df.assign(names = df.names.str.split()).explode('names')
#    category names
# 0         1    ab
# 0         1     c
# 1         2     s
# 2         1    dm
# 2         1    ab
# 2         1   aaa
If you need to reset index (from @KRKirov's comment):
df.assign(names = df.names.str.split()).explode('names').reset_index(drop=True)
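For reference, here is a self-contained sketch of the same split-then-explode approach, using the sample data from the question. It assumes pandas 0.25 or newer, since DataFrame.explode is not available before that. The variable names tokens, result and unique_pairs are only illustrative. The last step is optional and rests on an assumption: the question asks for "unique" tokens, but its expected output keeps 'ab' twice, so drop_duplicates is shown only in case repeated (category, token) pairs should be removed.

```python
import pandas as pd

df = pd.DataFrame({'category': [1, 2, 1], 'names': ['ab c', 's', 'dm ab aaa']})

# str.split() with no arguments splits on any run of whitespace,
# so each cell of 'names' becomes a list of tokens.
tokens = df.assign(names=df['names'].str.split())

# explode() produces one row per list element and repeats 'category';
# reset_index(drop=True) replaces the repeated index with 0..n-1.
result = tokens.explode('names').reset_index(drop=True)
print(result)

# Optional (assumption about what "unique" means): keep only the
# first occurrence of each (category, token) pair.
unique_pairs = result.drop_duplicates().reset_index(drop=True)
print(unique_pairs)
```

Without the drop_duplicates step, result matches the expected dataframe from the question row for row.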
Upvotes: 1