Reputation: 29

How to Overwrite a Series Value through Conditional Expression?

I'm trying to overwrite an existing column in a DataFrame based on a conditional expression:

if df['service'] is 'PE1' or 'PE2':
    change the value in df['service'] to the original df['service'] + df['load port']
    # e.g. if df['load port'] == 'ABC', the new value is 'PE1ABC'
else:
    keep the original value in df['service']  # i.e. the service is neither 'PE1' nor 'PE2'

I want to use .merge() to do a "VLOOKUP" against another DataFrame. However, the 'PE1' and 'PE2' services require the load port as part of the lookup key. All other services have a 1:1 assignment.

Upvotes: 0

Answers (2)

Ben.T

Reputation: 29635

You can use numpy.where for this:

import numpy as np
df['service'] = np.where((df['service'] =='PE1')|(df['service'] =='PE2'), #conditions
                          df['service']+df['load port'], #result if conditions are met
                          df['service']) # result if not

The apply approach from @Lorran Sutter works, but np.where is vectorized, so it will be faster on a large DataFrame.
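If the list of special services grows, a variant with Series.isin and .loc may be easier to maintain than chaining == comparisons. The sketch below uses the sample data from the other answer and then does the "VLOOKUP" step from the question with a left merge. The lookup table, its region column, and all of its values are made up for illustration; they are not from the question.

```python
import pandas as pd

# Sample data mirroring the example frame in the other answer
df = pd.DataFrame({
    "service": ["PE1", "PE2", "bla", "ble", "PE2"],
    "load port": ["ABC", "TEST", "BLA", "BLA", "BLE"],
})

# Boolean mask: True where the service needs the load port appended
needs_port = df["service"].isin(["PE1", "PE2"])

# Overwrite only the matching rows; every other row keeps its value
df.loc[needs_port, "service"] = df["service"] + df["load port"]

# Hypothetical lookup table (column name and values are assumptions)
lookup = pd.DataFrame({
    "service": ["PE1ABC", "PE2TEST", "PE2BLE", "bla", "ble"],
    "region": ["north", "south", "east", "west", "west"],
})

# A left merge keeps every row of df, much like a VLOOKUP
merged = df.merge(lookup, on="service", how="left")
print(merged)
```

Because the key is rewritten before the merge, the services with a 1:1 assignment and the PE1/PE2 services can share a single lookup table.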

Upvotes: 0

Lorran Sutter

Reputation: 528

You can define a function with your conditions, then use apply to change your column.

Example data frame:

import pandas as pd

df = pd.DataFrame({'service': ['PE1', 'PE2', 'bla', 'ble', 'PE2'],
                   'load port': ['ABC', 'TEST', 'BLA', 'BLA', 'BLE']})

Output:

  load port service
0       ABC     PE1
1      TEST     PE2
2       BLA     bla
3       BLA     ble
4       BLE     PE2

Change function:

def changeService(row):
    if row['service'] == 'PE1' or row['service'] == 'PE2':
        return row['service'] + row['load port']
    return row['service']

Apply change function so as to overwrite your column:

df['service'] = df.apply(changeService, axis = 1)

Output:

  load port  service
0       ABC   PE1ABC
1      TEST  PE2TEST
2       BLA      bla
3       BLA      ble
4       BLE   PE2BLE

Note: Make sure your change function returns a value on every code path; otherwise, rows that fall through without a return end up with missing values (None/NaN).
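To see why the final return matters, here is a small sketch of what can happen without it. The function name change_service_no_fallback is hypothetical; it is the change function with the last line left out on purpose.

```python
import pandas as pd

df = pd.DataFrame({
    "service": ["PE1", "PE2", "bla", "ble", "PE2"],
    "load port": ["ABC", "TEST", "BLA", "BLA", "BLE"],
})

def change_service_no_fallback(row):
    # Hypothetical variant that forgets the final `return row['service']`
    if row["service"] in ("PE1", "PE2"):
        return row["service"] + row["load port"]
    # No return here: Python implicitly returns None for all other rows

result = df.apply(change_service_no_fallback, axis=1)

# Rows that did not match the condition come back as missing values
print(result.isna().tolist())
```

Assigning result back to df['service'] would wipe out the original values for every row that is not PE1 or PE2.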

Upvotes: 1
