Vranvs

Reputation: 1521

Update column based on value from another column

I have a dataframe like so:

  HNCO_iloc    C_shift HNcoCA_iloc CA_shift i_FROM  i_TO
0        25  180.40469        None     None     36  None
0        70  179.52209        None     None     46  None
0       137  178.47112        None     None     49  None
0       157  178.35808        None     None     36  None
0       204  177.14190        None     None     32  None
0       221  176.99482        None     None     26  None
0       261  176.12322        None     None     16  None
0       269  176.08582        None     None     36  None
0       270  176.09042        None     None     36  None
0       313  175.27206        None     None     24  None
0       322  175.18865        None     None     31  None
0       332  174.78010        None     None     43  None
0       342  174.34505        None     None     45  None
0       352  173.76740        None     None      1  None
0       358  173.46177        None     None      3  None
0       363  172.97259        None     None      3  None

I want to be able to access the 'CA_shift' column using the values in 'i_FROM', and update that cell. So, for example, if I wanted to change the row that has i_FROM equal to 3, I would want the following output:

  HNCO_iloc    C_shift HNcoCA_iloc CA_shift i_FROM  i_TO
0        25  180.40469        None     None     36  None
0        70  179.52209        None     None     46  None
0       137  178.47112        None     None     49  None
0       157  178.35808        None     None     36  None
0       204  177.14190        None     None     32  None
0       221  176.99482        None     None     26  None
0       261  176.12322        None     None     16  None
0       269  176.08582        None     None     36  None
0       270  176.09042        None     None     36  None
0       313  175.27206        None     None     24  None
0       322  175.18865        None     None     31  None
0       332  174.78010        None     None     43  None
0       342  174.34505        None     None     45  None
0       352  173.76740        None     None      1  None
0       358  173.46177        None     test      3  None
0       363  172.97259        None     test      3  None

Right now I am doing:

connectivity_df.loc[connectivity_df['i_FROM' == 3], 'CA_shift'] = 'test'

But I am continually getting KeyErrors.

Upvotes: 1

Views: 47

Answers (1)

Celius Stingher

Reputation: 18367

You can solve this with np.where(). The code would be as follows:

import numpy as np
import pandas as pd

df = pd.DataFrame({'CA_shift':[None,None,None,None,None,None],'i_FROM':[36,46,49,36,3,3]})

1st case: Replacing one to one:

example_value = 3
df['CA_shift'] = np.where(df['i_FROM'] == example_value,'test',df['CA_shift'])

Output:

  CA_shift  i_FROM
0     None      36
1     None      46
2     None      49
3     None      36
4     test       3
5     test       3

This operation is vectorized and behaves like an if/else applied to every row: where i_FROM equals example_value, CA_shift is set to 'test'; otherwise the row keeps its original CA_shift value.
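As for the KeyError in the question, it comes from where the comparison sits. 'i_FROM' == 3 compares a string with an integer and evaluates to False before pandas ever sees it, so connectivity_df[False] is looked up as a column label. Moving the comparison outside the brackets produces a boolean mask that .loc accepts. A minimal sketch, assuming a small stand-in frame with only the two relevant columns rather than the full data from the question:

```python
import pandas as pd

# Stand-in for connectivity_df (assumption: only the two relevant columns).
connectivity_df = pd.DataFrame({
    'CA_shift': [None, None, None, None, None, None],
    'i_FROM': [36, 46, 49, 36, 3, 3],
})

# Compare the column itself, not the column name: this yields a boolean Series.
mask = connectivity_df['i_FROM'] == 3

# Assign only to the rows where the mask is True.
connectivity_df.loc[mask, 'CA_shift'] = 'test'
print(connectivity_df)
```

This updates the frame in place, whereas the np.where() version rebuilds the whole column; either should work for a single value.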

2nd case: Replacing many to one:

mult_values = [3,46,49]
df['CA_shift'] = np.where(df['i_FROM'].isin(mult_values),'test',df['CA_shift'])
print(df)

Output:

  CA_shift  i_FROM
0     None      36
1     test      46
2     test      49
3     None      36
4     test       3
5     test       3

3rd case: Replacing many to many:

mult_values = [3,46,49]
conditions = [df['i_FROM']==mult_values[0],df['i_FROM']==mult_values[1],df['i_FROM']==mult_values[2]]
choices = ['test1','test2','test3']
df['CA_shift'] = np.select(conditions,choices,default=df['CA_shift'])
print(df)

Output:

  CA_shift  i_FROM
0     None      36
1    test2      46
2    test3      49
3     None      36
4    test1       3
5    test1       3
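When the list of values grows, writing each condition by hand gets tedious. The sketch below builds the conditions and choices for np.select() from a dictionary instead; the replacements mapping is a hypothetical name introduced here for illustration, and its pairs mirror the third case above. Passing default=df['CA_shift'] leaves rows that match no condition unchanged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'CA_shift': [None] * 6,
    'i_FROM': [36, 46, 49, 36, 3, 3],
})

# Hypothetical mapping of i_FROM value -> new CA_shift label.
replacements = {3: 'test1', 46: 'test2', 49: 'test3'}

# One boolean condition per key, in the same order as the choices.
conditions = [df['i_FROM'] == value for value in replacements]
choices = list(replacements.values())

# Rows matching no condition keep their current CA_shift value.
df['CA_shift'] = np.select(conditions, choices, default=df['CA_shift'])
print(df)
```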

Upvotes: 1
