Reputation: 51
If I have pandas data like this:
s1 s2 s3
1 None 1
1 2 1
2 2 2
1 2 None
I want to add a new column 's' whose value is None if the values of s1, s2 and s3 don't match. If they match (ignoring None in the comparison), the value should be the common value. So the output would be:
s1 s2 s3 s
1 None 1 1 (Ignoring None in the comparison here)
1 2 1 None
2 2 2 2
1 2 None None
What is the best way to introduce this new conditional column in pandas?
Upvotes: 1
Views: 848
Reputation:
Assuming your columns are numeric and the None values are stored as NaN, you can do:
df['s'] = np.where(df.std(axis=1)==0, df.mean(axis=1), np.nan)
df
Out:
s1 s2 s3 s
0 1 NaN 1.0 1.0
1 1 2.0 1.0 NaN
2 2 2.0 2.0 2.0
3 1 2.0 NaN NaN
This relies on the fact that if all values in a row are equal, the standard deviation of that row is 0 and the mean equals that common value. Both the mean and the standard deviation calculations ignore NaNs.
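As an alternative sketch (not the approach above, and only under the same assumption that the columns are numeric with NaN for missing values), you could count the distinct non-NaN values per row with nunique instead of comparing a floating-point standard deviation to 0. The names cols and n_distinct are illustrative. One difference worth noting: with the default ddof=1, the standard deviation of a row with a single non-NaN value is NaN, so the std-based version gives NaN for such a row, while this version returns the lone value.

```python
import numpy as np
import pandas as pd

# Example data from the question, with None represented as NaN.
df = pd.DataFrame({
    "s1": [1, 1, 2, 1],
    "s2": [np.nan, 2, 2, 2],
    "s3": [1, 1, 2, np.nan],
})

cols = ["s1", "s2", "s3"]

# nunique skips NaN by default, so this counts distinct non-missing values per row.
n_distinct = df[cols].nunique(axis=1)

# When a row has exactly one distinct value, min (which also skips NaN) is that value;
# every other row is set to NaN.
df["s"] = df[cols].min(axis=1).where(n_distinct == 1)
```

Whether a row with only one non-missing value should count as a match is a design choice the question does not settle, so pick whichever behavior fits your data.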
If the None values are stored as the string 'None' rather than as NaN, replace them first:
df = df.replace({'None': np.nan})
where np is numpy (import numpy as np).
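If the columns end up with object dtype because they mix numbers with the string 'None', the row-wise statistics may not behave as expected. A minimal sketch of one way to handle that, assuming every column should be numeric, is to coerce with pd.to_numeric, which turns anything unparseable into NaN. The names raw, cols and clean are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical input where missing entries arrived as the string 'None'.
raw = pd.DataFrame({
    "s1": [1, 1, 2, 1],
    "s2": ["None", 2, 2, 2],
    "s3": [1, 1, 2, "None"],
})

cols = ["s1", "s2", "s3"]

# Coerce each column to a numeric dtype; values that cannot be parsed become NaN.
clean = raw[cols].apply(pd.to_numeric, errors="coerce")

# Same std/mean idea as in the answer, applied to the cleaned columns.
raw["s"] = np.where(clean.std(axis=1) == 0, clean.mean(axis=1), np.nan)
```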
Upvotes: 1