AleB
AleB

Reputation: 173

Multiple condition over a variable number of columns

I have a pandas Dataframe with 2 fixed columns and a variable number of columns after these two. I need to change the values in the second column, depending on those of the others. The problem is that I don't know in advance how many additional columns I will have in my dataframe and I need to set a flexible way to check their value.

def validateAndSetSignals(self, signalsDf, datesReb):


        totSignals = pd.DataFrame(0, columns = ['TOT_SIGNAL','TRADING_DAY'], index = self.unqDates)
        for names in signalsDf.keys():
            tmpSign = signalsDf[names].sum(axis =1)
            totSignals[names] = tmpSign
            totSignals['TOT_SIGNAL'] = totSignals['TOT_SIGNAL'] + tmpSign




        for i in range (len(totSignals.columns) - 2):         
            totSignals.loc[totSignals[totSignals.columns[2:]] != 0, 'TRADING_DAY'] = 1

As you can see, 'TOT_SIGNAL' and 'TRADING_DAY' are the fixed columns, while I may have one or more depending on the for cycle at the beginning. Then I want to check when at least one of the additional columns is different from 0, and set 1 the column 'TRADING_DAY'.

I am in trouble because I am not able to create specific conditions, since I don't know in advance the number and the names of the additional columns. I tried using positional indexing, since all of them will be after the second, but It does not work. Is there a way to do that?

Upvotes: 0

Views: 928

Answers (2)

Stef
Stef

Reputation: 30579

Here's a solution using any and mask without apply:

df = pd.DataFrame(index=range(8), columns = ['TOT_SIGNAL','TRADING_DAY']).join(pd.DataFrame(np.eye(8, 5)))

df.TRADING_DAY = df.TRADING_DAY.mask((df.iloc[:,2:] != 0).any(axis=1), 1)

Result:

  TOT_SIGNAL TRADING_DAY    0    1    2    3    4
0        NaN           1  1.0  0.0  0.0  0.0  0.0
1        NaN           1  0.0  1.0  0.0  0.0  0.0
2        NaN           1  0.0  0.0  1.0  0.0  0.0
3        NaN           1  0.0  0.0  0.0  1.0  0.0
4        NaN           1  0.0  0.0  0.0  0.0  1.0
5        NaN         NaN  0.0  0.0  0.0  0.0  0.0
6        NaN         NaN  0.0  0.0  0.0  0.0  0.0
7        NaN         NaN  0.0  0.0  0.0  0.0  0.0

Upvotes: 2

SmileyProd
SmileyProd

Reputation: 796

You can use the apply function on all rows by following the method below.

First define a function to see if a row has at least one value different to zero:

def checkRow(row):
    for x in row:
        # Equivalent to if x != 0 or if x == True
        if x:
            return 1
    # If all columns for this row is equal to 0 return 0
    return 0

Then you can use the apply function on all columns after the two first ones and put the result in the TRADING_DAY column as follows:

totSignals.TRADING_DAY = totSignals.iloc[:, 2:].apply(lambda row: checkRow(row), axis=1)

To understand better the code, here is the offical documentation of all the function used:

DataFrame.iloc: to select the columns that you are interested in.

DataFrame.apply: to apply the function on every row of the DataFrame object.

Upvotes: 1

Related Questions