thepez87
thepez87

Reputation: 221

How do I create a new column in a dataframe based on conditions of other columns?

I have a data frame that looks like this:

                             TransactionId   Value
Timestamp                                     
2018-01-07 22:00:00.000         633025      674.87
2018-01-07 22:15:00.000         633025      676.11
2018-01-07 22:30:00.000         633025      677.06

I want to create a third column with 3 possible classes based on conditions of the other 2 columns. I tried writing a function below, but it's not working - I don't get a return when I call df.head() after calling the function.

b = df.shape[0]
def charger_state(df):
    a = 1
    while a <= b: 
        if df.Value[a]-df.Value[(a-1)] > 0.1 :
            df['Charger State']= "Charging"
        elif df.Value[a]-df.Value[(a-1)] < 0.1 \
        and df['TransactionId'] > 0:
            df['Charger State']= "Not Charging"
        else: 
            df['Charger State']= "Vacant"
    a = a+1

The other answers surrounding this topic don't seem to cover 3 classes for the new column, but I am a novice so just might not get it.

Upvotes: 3

Views: 84

Answers (1)

user3483203
user3483203

Reputation: 51165

First, setup your conditions:

c1 = df.Value.sub(df.Value.shift()).gt(0.1)
c2 = df.Value.diff().lt(0.1) & df.TransactionId.gt(0)

Now use np.select:

df.assign(ChargerState=np.select([c1, c2], ['Charging', 'Not Charging'], 'Vacant'))

                     TransactionId   Value ChargerState
Timestamp
2018-01-07 22:00:00         633025  674.87       Vacant
2018-01-07 22:15:00         633025  676.11     Charging
2018-01-07 22:30:00         633025  677.06     Charging

You may need to tweak c1, because in this example, although it has both a TransactionId and a Value, it shows up as Vacant, because there is no previous row.

One potential option would be to assume that if a device has a Value and TransactionID, it has started charging, which we could accomplish using fillna on c1:

c1 = df.Value.sub(df.Value.shift().fillna(0)).gt(0.1)    # Notice the fillna
c2 = df.Value.diff().lt(0.1) & df.TransactionId.gt(0)

df.assign(ChargerState=np.select([c1, c2], ['Charging', 'Not Charging'], 'Vacant'))

                     TransactionId   Value ChargerState
Timestamp
2018-01-07 22:00:00         633025  674.87     Charging
2018-01-07 22:15:00         633025  676.11     Charging
2018-01-07 22:30:00         633025  677.06     Charging

Upvotes: 3

Related Questions