Reputation: 486
I am trying to apply a function to one column of a dataframe, but rows where another column, df['mask'], is False should be skipped. The mask column is of bool type.
This is my function:
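import random
import numpy as np
from commonregex import CommonRegex

def dates(inp):
    # inp is expected to be a whole Series, not a single value
    temp = inp
    parser = CommonRegex()
    inp = inp.apply(parser.dates).str.join(', ')
    return np.where(inp.apply(parser.dates).str.len() == 0, temp, 'X' * random.randrange(3, 8))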
Here is how I applied it:
df1.assign(**df1['Dates'].apply(dates).where(df1['mask']== TRUE))
It throws this error:
32 temp = inp
33 parser = CommonRegex()
---> 34 inp = inp.apply(parser.dates).str.join(', ')
35 return np.where(inp.apply(parser.dates).str.len() == 0, temp, 'X' * random.randrange(3, 8))
36
AttributeError: 'Timestamp' object has no attribute 'apply'
Here is what my dataframe looks like:
Name | Dates      | mask
..............................
Tom  | 21/02/2018 | True
Nick | 28/07/2018 | False
Juli | 11/08/2018 | True
June | 01/02/2018 | True
XHGM | 07/08/2018 | False
I want the output to look like this: rows where mask is False should be skipped, and rows where mask is True should go through the dates function so that the date values are hidden.
Name | Dates      | mask
..............................
Tom  | XXXXX      | True
Nick | 28/07/2018 | False
Juli | XXXXX      | True
June | XXXXX      | True
XHGM | 07/08/2018 | False
Upvotes: 1
Views: 51
Reputation: 862406
Use Series.pipe to pass the column to the function as a whole Series, filter the rows with boolean indexing by the mask column, and use DataFrame.loc to specify the column name:
df1.loc[df1['mask'], 'Dates'] = df1.loc[df1['mask'], 'Dates'].pipe(dates)
print (df1)
Name Dates mask
0 Tom XXX True
1 Nick 28/07/2018 False
2 Juli XXX True
3 June XXX True
4 XHGM 07/08/2018 False
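To illustrate why pipe works here where apply did not, the sketch below contrasts the two. It is only an illustration under stated assumptions: redact is a hypothetical stand-in for dates() (it replaces digits with X so the example needs no CommonRegex), and the Dates column holds strings rather than Timestamps, so the error comes from a str instead of a Timestamp.

```python
import pandas as pd

def redact(col):
    # Hypothetical stand-in for dates(): expects a whole Series
    return col.str.replace(r"\d", "X", regex=True)

df1 = pd.DataFrame({
    "Name": ["Tom", "Nick", "Juli", "June", "XHGM"],
    "Dates": ["21/02/2018", "28/07/2018", "11/08/2018", "01/02/2018", "07/08/2018"],
    "mask": [True, False, True, True, False],
})

# Series.apply calls the function once per element, so Series-only
# attributes such as .str (or .apply) are missing on each value
try:
    df1["Dates"].apply(redact)
    apply_failed = False
except AttributeError:
    apply_failed = True

# Series.pipe calls the function once with the whole (filtered) Series
rows = df1["mask"]
df1.loc[rows, "Dates"] = df1.loc[rows, "Dates"].pipe(redact)
print(df1)
```

Only the rows selected by the mask are passed to the function, and only those rows are written back.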
A solution with assign is possible too, but the disadvantage is that the function runs over all values and the result is filtered only afterwards, so it should be slower if a large DataFrame contains only a few True values:
df1 = df1.assign(Dates = np.where(df1['mask'], df1['Dates'].pipe(dates), df1['Dates']))
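A small sketch of that difference, again using a hypothetical redact function in place of dates() and string dates as an assumption. The seen list is added purely for illustration, to record how many rows the function received.

```python
import numpy as np
import pandas as pd

seen = []  # illustration only: sizes of the Series passed to redact

def redact(col):
    # Hypothetical stand-in for dates(): expects a whole Series
    seen.append(len(col))
    return col.str.replace(r"\d", "X", regex=True)

df1 = pd.DataFrame({
    "Name": ["Tom", "Nick", "Juli", "June", "XHGM"],
    "Dates": ["21/02/2018", "28/07/2018", "11/08/2018", "01/02/2018", "07/08/2018"],
    "mask": [True, False, True, True, False],
})

# The function processes every row; np.where then keeps the original
# value wherever mask is False
out = df1.assign(Dates=np.where(df1["mask"], df1["Dates"].pipe(redact), df1["Dates"]))
print(out)
print(seen)
```

Here the function receives all five rows even though only three are masked, whereas the loc version would hand it just the three True rows.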
Upvotes: 1