Shahnawaz

Reputation: 57

Pandas dataframe column split

I have a pandas dataframe containing 3 columns.

This is what it looks like:

User  History  New
101   [X,Y,Z]  [A-0,B-1]
102   [Q,M,N]  [A-1,B-0]

I would like to modify my dataframe to be represented this way:

User  History  New  0or1
101   [X,Y,Z]  A    0
101   [X,Y,Z]  B    1
102   [Q,M,N]  A    1
102   [Q,M,N]  B    0

How can I do so?

Basically, I’m doing this because I’m trying to build a model that predicts 0 or 1 for each element in New based on the History. Hence, I thought splitting the rows this way would make sense, so that the model can be trained on the three columns.

Although I’m looking for a way to split the dataframe as described, I’m open to suggestions if there is a more efficient way to use the original data (the first table) to build a model that predicts 1 or 0 for each element of the New array given the corresponding History.

Thanks in advance.

Upvotes: 0

Views: 89

Answers (1)

Onyambu

Reputation: 79228

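import pandas as pd

# Give each element of the 'New' list its own row
df1 = df.explode('New')
# Split 'A-0' on '-' into two new columns (named 0 and 1) and append them
pd.concat([df1, df1.New.str.split('-', expand=True)], axis=1)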
   User    History  New  0  1
0   101  [X, Y, Z]  A-0  A  0
0   101  [X, Y, Z]  B-1  B  1
1   102  [Q, M, N]  A-1  A  1
1   102  [Q, M, N]  B-0  B  0
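
The output above still keeps the combined values in New and names the split columns 0 and 1. If you want the exact layout from the question, something along these lines might work. It is only a sketch: it assumes New holds real Python lists of strings (not one string like "[A-0,B-1]"), that each element contains a single '-', and that your pandas version is 1.1 or later so that explode accepts ignore_index. The sample frame is rebuilt here just so the snippet runs on its own.

```python
import pandas as pd

# Sample data mirroring the question (assumption: 'New' contains lists of strings)
df = pd.DataFrame({
    "User": [101, 102],
    "History": [["X", "Y", "Z"], ["Q", "M", "N"]],
    "New": [["A-0", "B-1"], ["A-1", "B-0"]],
})

# One row per element of 'New'; ignore_index gives a fresh 0..n-1 index
out = df.explode("New", ignore_index=True)

# Split "A-0" into the label part and the flag part
parts = out["New"].str.split("-", n=1, expand=True)
out["New"] = parts[0]               # keep only the label, e.g. "A"
out["0or1"] = parts[1].astype(int)  # store the flag as an integer

print(out)
```

Converting the flag to an integer is optional, but it is probably more convenient as a model target than the string "0" or "1".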

Upvotes: 1
