Jui Sen

Reputation: 377

Creating new column from existing columns

I have a DataFrame:

ID   P_1   P_2
1    NaN   NaN
2    124   342
3    NaN   234
4    123   NaN
5    2345  500

I want to make a new column titled P_3 such that:

ID   P_1   P_2  P_3
1    NaN   NaN   NaN
2    124   342   342
3    NaN   234   234
4    123   NaN   123
5    2345  500   500

My conditions are:

If P_1 is NaN, then P_3 = P_2
If P_1 is not NaN and P_2 is not NaN, then P_3 = P_2
If P_2 is NaN, then P_3 = P_1

I tried the following code:

conditions = [
    (df['P_1'] == float('NaN')),
    (df['P_1'] != float('NaN')) & (df['P_2'] != float('NaN')),
    (df['P_1'] != float('NaN')) & (df['P_2'] == float('NaN'))
    ]

values = [df['P_2'], df['P_2'], df['P_1']]

df['P_3'] = np.select(conditions, values)

But it gives me the following error:

Length of values does not match length of index

Upvotes: 1

Views: 62

Answers (2)

inspectorG4dget

Reputation: 113955

Another approach:

In [93]: df
Out[93]: 
       p1     p2
0     NaN    NaN
1   124.0  342.0
2     NaN  234.0
3   123.0    NaN
4  2345.0  500.0

In [94]: df['p3'] = df.p2

In [95]: df
Out[95]: 
       p1     p2     p3
0     NaN    NaN    NaN
1   124.0  342.0  342.0
2     NaN  234.0  234.0
3   123.0    NaN    NaN
4  2345.0  500.0  500.0

In [96]: df.loc[df.p3.isna(), 'p3'] = df[df.p3.isna()]['p1']

In [97]: df
Out[97]: 
       p1     p2     p3
0     NaN    NaN    NaN
1   124.0  342.0  342.0
2     NaN  234.0  234.0
3   123.0    NaN  123.0
4  2345.0  500.0  500.0
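
The copy-then-patch steps above can probably be collapsed into a single call. A minimal sketch, assuming the same lowercase column names (p1, p2, p3) as in the session above and rebuilding the frame by hand so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the session above (values copied by hand).
df = pd.DataFrame({
    "p1": [np.nan, 124, np.nan, 123, 2345],
    "p2": [np.nan, 342, 234, np.nan, 500],
})

# Start from p2 and fill its missing entries with p1 from the same row.
# Rows where both columns are missing stay NaN.
df["p3"] = df["p2"].fillna(df["p1"])

print(df)
```

Series.fillna accepts another Series and aligns it on the index, so this should give the same p3 column as the two assignments above.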

Upvotes: 2

Corralien

Reputation: 120409

In summary, your three conditions reduce to a single rule:

P_3 = P_2 if P_2 != NaN else P_1

combine_first updates null elements with the value in the same location in the other Series (ref: pandas documentation).

>>> df["P_2"].combine_first(df["P_1"])
ID
1      NaN
2    342.0
3    234.0
4    123.0
5    500.0
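
To store the result as P_3, and to see why the masks in the question never match, here is a minimal sketch. It assumes ID is the index (as in the output above); the P_3_select column name is made up purely for comparison.

```python
import numpy as np
import pandas as pd

# Example frame from the question, with ID as the index (an assumption).
df = pd.DataFrame(
    {
        "P_1": [np.nan, 124, np.nan, 123, 2345],
        "P_2": [np.nan, 342, 234, np.nan, 500],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="ID"),
)

# NaN never compares equal to anything, itself included,
# so a mask built with == float('NaN') is False on every row.
assert not (df["P_1"] == float("NaN")).any()

# Take P_2 where it is present, otherwise fall back to P_1.
df["P_3"] = df["P_2"].combine_first(df["P_1"])

# The np.select idea from the question, with notna()/isna() masks instead.
# P_3_select is a hypothetical column name used only for this comparison.
conditions = [df["P_2"].notna(), df["P_2"].isna()]
choices = [df["P_2"], df["P_1"]]
df["P_3_select"] = np.select(conditions, choices, default=np.nan)

print(df)
```

Both columns should come out identical. combine_first is shorter; np.select is mainly worth keeping if you later need more than two branches.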

Upvotes: 2
