Sphery

Reputation: 35

np.where problem when creating new columns in pandas df based on other column content

I have the following df named Band_data:

Band name  Band players
B1         P1        
B2         P1; P2    

The goal is to get this df into the following shape:

Band name  P1  P2
B1         1   0     
B2         1   1  

The following doesn't seem to work:

Players = ['P1', 'P2']
for player in Players:
    Band_data[player] = np.where(player in Band_data["Band players"], 1, 0)
Band_data.drop(["Band players"], axis = 1)

because it returns:

Band name  P1  P2
B1         0   0     
B2         0   0 

The goal is of course to use this for arbitrarily many bands in the df; this is just a small example. Why is this not the correct way to do it, and how do I implement it correctly?

Upvotes: 0

Views: 38

Answers (1)

BENY

Reputation: 323316

IIUC, you can use str.get_dummies, splitting on the '; ' separator:

yourdf=Band_data.set_index('Band name')['Band players'].str.get_dummies('; ').reset_index()
yourdf
  Band name  P1  P2
0        B1   1   0
1        B2   1   1
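
As for why the original loop returns all zeros: `player in Band_data["Band players"]` is a single membership test against the Series index labels (0, 1, ...), not against the string in each row, so it evaluates to one `False` and `np.where` broadcasts the 0 to every row. If you want to keep the `np.where` approach, here is a minimal sketch. It assumes the sample frame from the question and that players are always separated by exactly `"; "`; the `player_lists` helper is just a name for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed layout)
Band_data = pd.DataFrame({
    "Band name": ["B1", "B2"],
    "Band players": ["P1", "P1; P2"],
})

# `in` on a Series checks the index labels, not the values
print("P1" in Band_data["Band players"])

players = ["P1", "P2"]

# Split each row into a list of player names (assumes "; " as separator)
player_lists = Band_data["Band players"].str.split("; ")

for player in players:
    # Row-wise membership test gives a boolean Series for np.where
    is_member = player_lists.apply(lambda names: player in names)
    Band_data[player] = np.where(is_member, 1, 0)

# drop returns a new frame, so the result has to be assigned
result = Band_data.drop(columns=["Band players"])
print(result)
```

Comparing against the split lists rather than using `str.contains` avoids a name like P1 also matching P10. Note also that the `drop` call in the question discards its result, which is why "Band players" would still be in the frame afterwards.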

Upvotes: 2
