Reputation: 51
Say I define the following dataframe:
df = pd.DataFrame({'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]})
print(df.head())
and I want to generate a new column by using some condition on the existing column(s)
df['XXX'] = df.AAA < 6
print(df.head())
where the output generated would be:
AAA BBB CCC XXX
0 4 10 100 True
1 5 20 50 True
2 6 30 -30 False
3 7 40 -50 False
I'd like to be able to generate these conditions dynamically, for example by passing something like
"AAA < 6"
or
"AAA < 4 & BBB == 10"
and have these conditions evaluated accordingly.
Does pandas provide some mechanism to allow something like this? I know eval() could be an alternative, but also note that the above strings would also need to be given the df as a context, ie: "AAA < 6" => "df.AAA < 6"
Alternatively, could someone recommend me a package which would allow me to define these conditions and then generate a lambda or similar which could be used to evaluate?
Thank you
Upvotes: 3
Views: 239
Reputation: 862511
Use DataFrame.eval
:
df = pd.DataFrame({'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]})
df['a'] = df.eval("AAA < 5 & BBB == 10")
df['b'] = df.eval("AAA < 6")
df['c'] = df.eval("AAA < 4 & BBB == 10")
print (df)
AAA BBB CCC a b c
0 4 10 100 True True False
1 5 20 50 False True False
2 6 30 -30 False False False
3 7 40 -50 False False False
More information about using this function with alternatives is Dynamic Expression Evaluation in pandas using pd.eval()
Upvotes: 3
Reputation: 88226
You could use pandas' eval
, which can evaluate a string describing operations:
cond = "AAA < 6"
df.eval(cond)
0 True
1 True
2 False
3 False
Name: AAA, dtype: bool
cond = "AAA < 6"
df.assign(XXX = df.eval(cond))
AAA BBB CCC XXX
0 4 10 100 True
1 5 20 50 True
2 6 30 -30 False
3 7 40 -50 False
cond = "AAA < 4 & BBB == 10"
df.assign(XXX = df.eval(cond))
AAA BBB CCC XXX
0 4 10 100 False
1 5 20 50 False
2 6 30 -30 False
3 7 40 -50 False
Upvotes: 3