Reputation: 33
As a beginner in Python, I'm trying to reproduce some R syntax for a data-wrangling task on a pandas DataFrame, but I have not been able to translate the ifelse statement inside a mutate call.
# R code
library(dplyr)  # provides %>%
df <- data.frame(var1 = c('2020-12-01','2020-12-02',NA,NA,'2020-12-05'),
var2 = c('start','start','start','start','start'),
stringsAsFactors = FALSE)
# var2 becomes 'complete' where var1 is present, otherwise it takes var1 (NA)
df <- df %>% dplyr::mutate(var2 = ifelse(!is.na(var1), 'complete', var1))
Any advice on how to obtain the same result with Python syntax?
Upvotes: 2
Views: 496
Reputation: 887691
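Try numpy.where, passing var1 itself as the fallback so that missing rows stay as real NaN
# np.where(cond, np.nan, 'complete') would coerce NaN to the string 'nan'
df['var2'] = np.where(df['var1'].notna(), 'complete', df['var1'])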
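A caveat worth checking when np.where mixes np.nan with a string: NumPy picks one common dtype for both branches, so the missing values can end up as the text 'nan' rather than a real NaN. A minimal sketch of that check (the series name `s` and the variable names are hypothetical, not from the question):

```python
import numpy as np
import pandas as pd

# Small object-dtype column with one missing value (illustrative data)
s = pd.Series(['2020-12-01', np.nan, '2020-12-05'], dtype=object)

# Both branches are scalars (float and str): NumPy coerces them to a string dtype
as_text = np.where(s.isna(), np.nan, 'complete')

# Using the original object column as the fallback keeps a real NaN
as_missing = np.where(s.notna(), 'complete', s)

# Count literal 'nan' strings versus genuine missing values
literal_nan_strings = int((as_text == 'nan').sum())
real_missing = int(pd.isna(as_missing).sum())
```

If `literal_nan_strings` is non-zero, a later `isnull()` on that column will not find those rows, which is why the fallback form is usually the safer one.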
Another option, closer to base R, is to create a logical index and use it for the replacement
i1 = df.var1.isnull()
df.loc[i1, 'var2'] = np.nan
df.loc[~i1, 'var2'] = 'complete'
-output
df
# var1 var2
#0 2020-12-01 complete
#1 2020-12-02 complete
#2 NaN NaN
#3 NaN NaN
#4 2020-12-05 complete
import numpy as np
import pandas as pd
df = pd.DataFrame({"var1":['2020-12-01','2020-12-02',np.nan,np.nan,'2020-12-05'],
"var2": ['start','start','start','start','start']})
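For comparison, a pandas-only variant that mirrors the R `ifelse(!is.na(var1), 'complete', var1)` quite literally. This is a sketch rather than the approach given above; it assumes the same example data and relies on `Series.mask`, which replaces values where the condition is True and keeps the original values elsewhere:

```python
import numpy as np
import pandas as pd

# Same example data as in the question
df = pd.DataFrame({
    "var1": ['2020-12-01', '2020-12-02', np.nan, np.nan, '2020-12-05'],
    "var2": ['start', 'start', 'start', 'start', 'start'],
})

# Where var1 is present, write 'complete'; elsewhere keep var1, i.e. the missing value
df["var2"] = df["var1"].mask(df["var1"].notna(), "complete")
```

Because the fallback is the original column, rows 2 and 3 remain genuinely missing and `df["var2"].isna()` still identifies them.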
Upvotes: 2