Reputation:
In the code below I'm trying to create a new variable var2 which is identical to the existing variable var1, except that it's null if var1 is later than 2021/1/1.
df_jan['var2'] = df_jan['var1'].apply(lambda x: np.nan if x['var1']>pd.Timestamp(2021,1,20))
I'm just getting a "syntax error" response. What am I doing wrong?
Upvotes: 0
Views: 35
Reputation: 24314
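You have to include an else branch as well when you use if inside the lambda passed to apply(): a Python conditional expression needs both parts, which is why you get the syntax error. Also, since you call apply() on the column df_jan['var1'], x is already a single value from that column, so compare x directly instead of x['var1'].
So try: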
df_jan['var2'] = df_jan['var1'].apply(lambda x: np.nan if x>pd.Timestamp(2021,1,20) else x)
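To see the fix run end to end, here is a minimal sketch on made-up data. The three sample dates and the name cutoff are assumptions for illustration only; they are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's df_jan.
df_jan = pd.DataFrame(
    {"var1": pd.to_datetime(["2021-01-05", "2021-01-20", "2021-01-25"])}
)

# Cutoff taken from the code in the question (20 January 2021).
cutoff = pd.Timestamp(2021, 1, 20)

# The conditional expression has both branches: value_if_true if cond else value_if_false.
# x is one Timestamp from the column, so it is compared directly.
df_jan["var2"] = df_jan["var1"].apply(lambda x: np.nan if x > cutoff else x)

print(df_jan)
```

Only the last row is strictly later than the cutoff, so only that row ends up null in var2.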
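By the way, apply() is a loop under the hood, so for better performance you can use the vectorized Series.mask() method, which replaces the values where the condition is True with a null (NaT for datetimes):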
df_jan['var2']=df_jan['var1'].mask(df_jan['var1']>pd.Timestamp(2021,1,20))
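Or the Series.where() method, which keeps the values where the condition is True and nulls the rest, hence the negated condition: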
df_jan['var2']=df_jan['var1'].where(~(df_jan['var1']>pd.Timestamp(2021,1,20)))
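As a quick check that the two vectorized forms agree, here is a small sketch on made-up data. The sample dates and the names cutoff, too_late, masked and kept are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df_jan.
df_jan = pd.DataFrame(
    {"var1": pd.to_datetime(["2021-01-05", "2021-01-20", "2021-01-25"])}
)
cutoff = pd.Timestamp(2021, 1, 20)

# Boolean Series: True where var1 is strictly later than the cutoff.
too_late = df_jan["var1"] > cutoff

# mask() nulls the rows where the condition is True.
masked = df_jan["var1"].mask(too_late)

# where() keeps the rows where the condition is True, so the condition is negated.
kept = df_jan["var1"].where(~too_late)

# Both approaches should produce the same Series.
print(masked.equals(kept))

df_jan["var2"] = masked
print(df_jan)
```

Both calls leave the column as a datetime column, with the nulled entries shown as NaT.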
Upvotes: 0