Reputation: 173
I have a pandas Dataframe with 2 fixed columns and a variable number of columns after these two. I need to change the values in the second column, depending on those of the others. The problem is that I don't know in advance how many additional columns I will have in my dataframe and I need to set a flexible way to check their value.
def validateAndSetSignals(self, signalsDf, datesReb):
totSignals = pd.DataFrame(0, columns = ['TOT_SIGNAL','TRADING_DAY'], index = self.unqDates)
for names in signalsDf.keys():
tmpSign = signalsDf[names].sum(axis =1)
totSignals[names] = tmpSign
totSignals['TOT_SIGNAL'] = totSignals['TOT_SIGNAL'] + tmpSign
for i in range (len(totSignals.columns) - 2):
totSignals.loc[totSignals[totSignals.columns[2:]] != 0, 'TRADING_DAY'] = 1
As you can see, 'TOT_SIGNAL'
and 'TRADING_DAY'
are the fixed columns, while I may have one or more depending on the for cycle at the beginning. Then I want to check when at least one of the additional columns is different from 0, and set 1 the column 'TRADING_DAY'
.
I am in trouble because I am not able to create specific conditions, since I don't know in advance the number and the names of the additional columns. I tried using positional indexing, since all of them will be after the second, but It does not work. Is there a way to do that?
Upvotes: 0
Views: 928
Reputation: 30579
Here's a solution using any
and mask
without apply
:
df = pd.DataFrame(index=range(8), columns = ['TOT_SIGNAL','TRADING_DAY']).join(pd.DataFrame(np.eye(8, 5)))
df.TRADING_DAY = df.TRADING_DAY.mask((df.iloc[:,2:] != 0).any(axis=1), 1)
Result:
TOT_SIGNAL TRADING_DAY 0 1 2 3 4
0 NaN 1 1.0 0.0 0.0 0.0 0.0
1 NaN 1 0.0 1.0 0.0 0.0 0.0
2 NaN 1 0.0 0.0 1.0 0.0 0.0
3 NaN 1 0.0 0.0 0.0 1.0 0.0
4 NaN 1 0.0 0.0 0.0 0.0 1.0
5 NaN NaN 0.0 0.0 0.0 0.0 0.0
6 NaN NaN 0.0 0.0 0.0 0.0 0.0
7 NaN NaN 0.0 0.0 0.0 0.0 0.0
Upvotes: 2
Reputation: 796
You can use the apply function on all rows by following the method below.
First define a function to see if a row has at least one value different to zero:
def checkRow(row):
for x in row:
# Equivalent to if x != 0 or if x == True
if x:
return 1
# If all columns for this row is equal to 0 return 0
return 0
Then you can use the apply function on all columns after the two first ones and put the result in the TRADING_DAY
column as follows:
totSignals.TRADING_DAY = totSignals.iloc[:, 2:].apply(lambda row: checkRow(row), axis=1)
To understand better the code, here is the offical documentation of all the function used:
DataFrame.iloc: to select the columns that you are interested in.
DataFrame.apply: to apply the function on every row of the DataFrame object.
Upvotes: 1