user1844086

How to null a pandas variable based on a date comparison

In the code below I'm trying to create a new variable var2 that is identical to the existing variable var1, except that it's null wherever var1 is later than 2021/1/20.

df_jan['var2'] = df_jan['var1'].apply(lambda x: np.nan if x['var1']>pd.Timestamp(2021,1,20))

I'm just getting a "syntax error" response. What am I doing wrong?

Upvotes: 0

Views: 35

Answers (1)

Anurag Dabas

Reputation: 24314

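A conditional expression inside a lambda must include an else branch; leaving it out is what raises the syntax error. Also, Series.apply() passes each value to the lambda, so compare x directly instead of x['var1'].

So try: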

df_jan['var2'] = df_jan['var1'].apply(lambda x: np.nan if x>pd.Timestamp(2021,1,20) else x)
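As a rough, self-contained sketch of the line above: the real df_jan isn't shown in the question, so the three sample dates and the cutoff name below are made up purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df_jan is not shown in the question.
df_jan = pd.DataFrame({
    "var1": pd.to_datetime(["2021-01-05", "2021-01-20", "2021-01-25"])
})

# Illustrative name for the threshold used in the question.
cutoff = pd.Timestamp(2021, 1, 20)

# The lambda receives one Timestamp at a time, so x is compared directly.
# The else branch returns the original value when the condition is false.
df_jan["var2"] = df_jan["var1"].apply(lambda x: np.nan if x > cutoff else x)

print(df_jan)
```

Only the last row is strictly later than the cutoff, so only that row ends up null; the row equal to the cutoff is kept because the comparison is > rather than >=.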

Note that apply() runs a Python-level loop under the hood, so for better performance you can use a vectorized method instead:

Series.mask() method:

df_jan['var2']=df_jan['var1'].mask(df_jan['var1']>pd.Timestamp(2021,1,20))

OR

Series.where() method:

df_jan['var2']=df_jan['var1'].where(~(df_jan['var1']>pd.Timestamp(2021,1,20)))
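To check that the two vectorized forms agree, here is a small sketch on made-up data (the sample dates and the names cutoff and too_late are assumptions for illustration, not part of the question):

```python
import pandas as pd

# Hypothetical sample data; the real df_jan is not shown in the question.
df_jan = pd.DataFrame({
    "var1": pd.to_datetime(["2021-01-05", "2021-01-20", "2021-01-25"])
})

cutoff = pd.Timestamp(2021, 1, 20)

# Boolean Series: True where the date is strictly later than the cutoff.
too_late = df_jan["var1"] > cutoff

# mask() replaces values where the condition is True.
masked = df_jan["var1"].mask(too_late)

# where() keeps values where the condition is True, so the condition is inverted.
kept = df_jan["var1"].where(~too_late)

df_jan["var2"] = masked

print(df_jan)
print(masked.equals(kept))
```

Both methods fill the replaced positions with NaT and keep the column as a datetime dtype, which is usually more convenient for later date arithmetic than mixing in np.nan by hand.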

Upvotes: 0
