Reputation: 55
I have a dataframe:
   BPR_free_speed  BPR_speed  Volume  time_normalised  free_capacity
0           17.88  15.913662     580         1.593750          475.0
1           17.88  15.865198     588         2.041667          475.0
2           17.88  16.511613     475         0.666667          475.0
3           17.88  16.882837     401         1.091458          467.0
4           99999  16.703004     438         1.479167          467.0
5           17.88  16.553928     467         0.960417          467.0
How can I get a Series based on a condition?
I want to find outliers and store the result in the Series df["has_outliers"]: if a row has a value greater than 550 in any column, then True, otherwise False.
The output for this dataframe should be
has_outliers
0 True
1 True
2 False
3 False
4 True
5 False
I think this can even be done with NumPy, but how?
Upvotes: 2
Views: 53
Reputation: 862511
Compare with DataFrame.gt, then use DataFrame.any to check for at least one True per row:
df["has_outliers"] = df.gt(550).any(axis=1)
Or count the True values per row and cast the count to boolean:
df["has_outliers"] = df.gt(550).sum(axis=1).astype(bool)
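Since the question also asks about NumPy, here is a minimal sketch of the same row-wise check on the underlying array. It rebuilds the sample frame from the question, uses the question's threshold of 550, and assumes every column is numeric; the variable names `threshold` and `mask` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "BPR_free_speed": [17.88, 17.88, 17.88, 17.88, 99999, 17.88],
    "BPR_speed": [15.913662, 15.865198, 16.511613, 16.882837, 16.703004, 16.553928],
    "Volume": [580, 588, 475, 401, 438, 467],
    "time_normalised": [1.593750, 2.041667, 0.666667, 1.091458, 1.479167, 0.960417],
    "free_capacity": [475.0, 475.0, 475.0, 467.0, 467.0, 467.0],
})

threshold = 550  # threshold stated in the question

# Compare the raw NumPy array element-wise, then reduce across columns (axis=1)
mask = np.any(df.to_numpy() > threshold, axis=1)

# Assign the boolean array back as a new column
df["has_outliers"] = mask
print(df["has_outliers"].tolist())
# [True, True, False, False, True, False]
```

If the frame also holds non-numeric columns, the comparison would need to be restricted first, for example with df.select_dtypes("number").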
Upvotes: 4