Reputation: 693
I have a large pandas DataFrame consisting of some 100k rows and ~100 columns with different dtypes and arbitrary content.
I need to assert that it does not contain a certain value, let's say `-1`.
Using `assert( not (any(test1.isin([-1]).sum()>0)))` results in a processing time of several seconds.
Any idea how to speed it up?
Upvotes: 1
Views: 87
Reputation: 4892
Just to make a full answer out of my comment:
With `-1 not in test1.values` you can check that `-1` does not occur anywhere in your DataFrame.
Regarding the performance, this still needs to check every single value, which in your case is 10^5 * 10^2 = 10^7 values.
All you save this way is the cost of the summation and of the additional comparison on its results.
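A minimal sketch comparing the two checks, assuming a small made-up frame standing in for `test1` (the data and the helper names `contains_isin` and `contains_values` are hypothetical, only there for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for test1: mixed dtypes, much smaller than 100k x 100.
rng = np.random.default_rng(0)
test1 = pd.DataFrame({
    "a": rng.integers(0, 10, size=1000),  # integers in [0, 10)
    "b": rng.random(1000),                # floats in [0, 1)
    "c": ["x"] * 1000,                    # strings
})

def contains_isin(df, value):
    # Check from the question: per-column match counts, then compare and reduce.
    return bool(any(df.isin([value]).sum() > 0))

def contains_values(df, value):
    # Check from this answer: membership test on the underlying NumPy array.
    return value in df.values

# Neither check finds -1 in the freshly built frame.
assert not contains_isin(test1, -1)
assert not contains_values(test1, -1)

# Plant a single -1 and confirm both checks detect it.
test1.loc[500, "a"] = -1
assert contains_isin(test1, -1)
assert contains_values(test1, -1)
```

Both checks give the same answer; to see how much time the second one saves on your data, time both on the real frame (for example with `timeit`).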
Upvotes: 1