How to call a custom function with pandas apply (axis=1) on only some specific rows of a dataframe?

Question

I have a function which sends automated messages to clients, and takes as input all the columns from a dataframe like the one below.

name	phone	status	date
name_1	phone_1	sending	today
name_2	phone_2	sending	yesterday

I iterate over the dataframe with DataFrame.apply (axis=1) and pass the column values of each row to my function. After sending, the function changes the status to "sent". The thing is, I only want to send to the clients whose date is "today". This is perfectly doable with apply(axis=1), but in order to select the clients with the "today" value, I need to:

create a new dataframe with today's value,
remove it from the original, and then
reappend it to the original.

I thought about running through the whole dataframe and ignoring the rows whose date is not "today", but if my dataframe keeps growing, I'm afraid the whole process will become slower.

I saw examples of this being done with mask, although usually people only use 1 column, and I need more than just the one. Is there any way to do this with pandas apply?

Thank you.

vamsi_s · Accepted Answer

I think you can use .loc to filter the rows and apply your function only to that subset.

In [12]: import numpy as np; import pandas as pd

In [13]: df = pd.DataFrame(np.random.rand(5,5))

In [14]: df
Out[14]:
          0         1         2         3         4
0  0.085870  0.013683  0.221890  0.533393  0.622122
1  0.191646  0.331533  0.259235  0.847078  0.649680
2  0.334781  0.521263  0.402030  0.973504  0.903314
3  0.189793  0.251130  0.983956  0.536816  0.703726
4  0.902107  0.226398  0.596697  0.489761  0.535270

Suppose we want to double the values in the rows where the value in the first column is greater than 0.3. Those rows are:

In [16]: df.loc[df[0] > 0.3]
Out[16]:
          0         1         2         3         4
2  0.334781  0.521263  0.402030  0.973504  0.903314
4  0.902107  0.226398  0.596697  0.489761  0.535270

In [18]: df.loc[df[0] > 0.3] = df.loc[df[0] > 0.3].apply(lambda x: x*2, axis=1)

In [19]: df
Out[19]:
          0         1         2         3         4
0  0.085870  0.013683  0.221890  0.533393  0.622122
1  0.191646  0.331533  0.259235  0.847078  0.649680
2  0.669563  1.042527  0.804061  1.947008  1.806628
3  0.189793  0.251130  0.983956  0.536816  0.703726
4  1.804213  0.452797  1.193394  0.979522  1.070540
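
The same pattern can be carried over to the dataframe from the question. The following is only a sketch: send_message is a hypothetical stand-in for the real sending function (here it just returns the new status), and the column names are taken from the sample table in the question.

```python
import pandas as pd


def send_message(row):
    # Hypothetical stand-in for the real sender. A real version would use
    # row["name"] and row["phone"] to send the message before returning.
    return "sent"


df = pd.DataFrame({
    "name": ["name_1", "name_2"],
    "phone": ["phone_1", "phone_2"],
    "status": ["sending", "sending"],
    "date": ["today", "yesterday"],
})

# Boolean mask that selects only the rows dated "today".
mask = df["date"] == "today"

# Guard against an empty selection, where apply would have no rows to call
# the function on.
if mask.any():
    # apply calls send_message once per selected row. The returned Series
    # keeps the index labels of those rows, so the assignment through .loc
    # updates the same rows in the original dataframe.
    df.loc[mask, "status"] = df.loc[mask].apply(send_message, axis=1)

print(df)
```

Only the rows selected by the mask are passed to the function, so there is no need to build a separate dataframe, remove it from the original, and append it back. The mask itself is computed on a single column, but the function still receives every column of each selected row.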
