JR1
JR1

Reputation: 57

How can I split a numerical column depending on the value of another column?

I want to split a column into two separate columns based on whether the value in column A is true or false.

Example:

A       X   
True    3   
False   6   
True    2   
False   4   

A       Y   Z
True    3   
False       6
True    2   
False       4

I've found examples online of this with string manipulation, but I'm working with integers.

I can combine the columns by df[Y] + df[Z], but can't find a way to split them.

Upvotes: 1

Views: 46

Answers (2)

jezrael
jezrael

Reputation: 863226

Use double numpy.where:

df['Y'] = np.where(df.A, df.X, np.nan)
df['Z'] = np.where(~df.A, df.X, np.nan)

Or Series.where with Series.mask:

df['Y'] = df.X.where(df.A)
df['Z'] = df.X.mask(df.A)

print (df)
       A  X    Y    Z
0   True  3  3.0  NaN
1  False  6  NaN  6.0
2   True  2  2.0  NaN
3  False  4  NaN  4.0

Or numpy.select with () for masks:

df['Y'], df['Z'] = np.select([(df.A, ~df.A)], [df.X], default=np.nan)

print (df)
       A  X    Y    Z
0   True  3  3.0  NaN
1  False  6  NaN  6.0
2   True  2  2.0  NaN
3  False  4  NaN  4.0

If want empty strings change NaNs to '', but if next processing is necessary then it failed:

df['Y'], df['Z'] = np.select([(df.A, ~df.A)], [df.X], default='')

Or:

df['Y'] = np.where(df.A, df.X, '')
df['Z'] = np.where(~df.A, df.X, '')
print (df)
       A  X  Y  Z
0   True  3  3   
1  False  6     6
2   True  2  2   
3  False  4     4

Upvotes: 1

Nick Scriabin
Nick Scriabin

Reputation: 301

df['Y'] = df['X'][df['A']==True]
df['Z'] = df['X'][df['A']==False]

Upvotes: 0

Related Questions