Reputation: 57
I have a pandas dataframe containing 3 columns.
This is what it looks like:
User  History  New
101   [X,Y,Z]  [A-0,B-1]
102   [Q,M,N]  [A-1,B-0]
I would like to modify my dataframe to be represented this way:
User  History  New  0or1
101   [X,Y,Z]  A    0
101   [X,Y,Z]  B    1
102   [Q,M,N]  A    1
102   [Q,M,N]  B    0
How can I do so?
The reason I’m doing this is that I’m trying to create a model which predicts 0 or 1 for each element in New based on the History. I thought splitting the rows this way would make sense, so that the model can be trained on the three columns.
Although I am looking for a way to split the dataframe as described, I’m open to suggestions if there is a more efficient way to use the data in the first table to build a model that predicts 1 or 0 for each element of the New array, given the corresponding History.
Thanks in advance.
Upvotes: 0
Views: 89
Reputation: 79228
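import pandas as pd

# df is the dataframe from the question; explode gives one row per element of the list in 'New'
df1 = df.explode('New')
# Split each 'A-0' string on '-' into two extra columns (named 0 and 1) and append them
pd.concat([df1, df1.New.str.split('-', expand=True)], axis=1)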
   User    History  New  0  1
0   101  [X, Y, Z]  A-0  A  0
0   101  [X, Y, Z]  B-1  B  1
1   102  [Q, M, N]  A-1  A  1
1   102  [Q, M, N]  B-0  B  0
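The output above keeps the original New strings and adds two columns named 0 and 1. To get exactly the layout asked for in the question (New holding only the label, plus a 0or1 column), a minimal sketch could look like the following. It assumes New holds Python lists of strings such as 'A-0'; if the column is actually stored as a single string like "[A-0,B-1]", it would need to be parsed into a list first. The variable names out and parts are only illustrative.

```python
import pandas as pd

# Rebuild the example from the question.
# Assumption: 'New' holds lists of 'label-flag' strings.
df = pd.DataFrame({
    "User": [101, 102],
    "History": [["X", "Y", "Z"], ["Q", "M", "N"]],
    "New": [["A-0", "B-1"], ["A-1", "B-0"]],
})

# One row per element of 'New'; User and History are repeated for each element.
out = df.explode("New").reset_index(drop=True)

# Split 'A-0' into the label and the flag.
parts = out["New"].str.split("-", n=1, expand=True)
out["New"] = parts[0]               # keep only the label, e.g. 'A'
out["0or1"] = parts[1].astype(int)  # numeric target, e.g. 0

print(out)
```

Converting 0or1 to an integer makes it directly usable as a target column for a model; History is still a list in every row and would likely need its own encoding before training.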
Upvotes: 1