Reputation: 305
GENERAL QUESTION
I was wondering if there exists a Python opposite to __contains__ (i.e., something like __notcontains__).
MY EXAMPLE
I need it for the following piece of code:
df_1 = df[(df.id1 != id1_array) | (df.id2.apply(id2_array.__contains__))]
df_2 = df[(df.id1 == id1_array) & (df.id2.apply(id2_array.__notcontains__))]
In other words, in df_1 I want only observations for which id1 is not in id1_array or id2 is in id2_array, while for df_2 I want only observations for which id1 is in id1_array and id2 is not in id2_array.
Who can help me out here? Thanks in advance!
Upvotes: 4
Views: 4312
Reputation: 1
The opposite of __contains__ is its negated result. One way to use this is as follows:
my_list.__contains__('ABC')      # True if 'ABC' is present in my_list
not my_list.__contains__('ABC')  # False if 'ABC' is present in my_list
Upvotes: 0
Reputation: 1
Sorry for the (very) late response. If what you're trying to do is check whether a character is in a string or not, you could try this! It isn't very optimized but it might work :))
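targetCharacter = "a"  # the character to look for (example value)
yourCharacter = False  # becomes True once the character has been found
while not yourCharacter:
    stringVariable = str(input("text"))
    # Compare every character of the input with the target character.
    for characterPosition in range(0, len(stringVariable)):
        characterTest = stringVariable[characterPosition]
        if characterTest == targetCharacter:
            yourCharacter = True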
This (as you probably know) will let you use the yourCharacter variable to check whether the character is in the string or input.
I hope it helps somehow, and again, sorry for the late response :)
Upvotes: 0
Reputation: 394031
To answer how to do this in pure pandas: you can use isin and the negation operator ~ to invert the boolean series:
df_1 = df[(df.id1 != id1_array) | (df.id2.isin(id2_array))]
df_2 = df[(df.id1 == id1_array) & (~df.id2.isin(id2_array))]
This will be faster than using apply on a larger dataset, as isin is vectorised.
Comparison operators such as == and != return True/False depending on whether the array values are the same/different in the same position. If you are testing just for membership, i.e. whether each value appears anywhere in a list of values, then use isin. This also returns a boolean series, which is True where matches are found; to invert it use ~.
Also, as a general rule, avoid using apply unless there is no alternative. The reason is that apply is just syntactic sugar for executing a for loop on the df, and this isn't vectorised. There are usually ways to achieve the same result without using apply if you dig hard enough.
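As a rough illustration of the membership test described above, here is a minimal sketch that uses isin for both columns. The sample data is made up for the example, and applying isin to id1 as well is an assumption based on the question's wording ("id1 is in id1_array"), not something the code above does:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"id1": [1, 2, 3, 4], "id2": ["a", "b", "c", "d"]})
id1_array = [1, 2]
id2_array = ["a", "c"]

in_id1 = df["id1"].isin(id1_array)  # True where id1 is one of id1_array
in_id2 = df["id2"].isin(id2_array)  # True where id2 is one of id2_array

df_1 = df[~in_id1 | in_id2]  # id1 not in id1_array, or id2 in id2_array
df_2 = df[in_id1 & ~in_id2]  # id1 in id1_array, and id2 not in id2_array
```

With these masks, df_2 contains exactly the rows that df_1 leaves out, since the second condition is the logical negation of the first.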
Upvotes: 3
Reputation: 4387
EDIT: I didn't notice this was using pandas specifically. My answer may not be accurate.
Generally, the magic functions (anything with double underscores before and after) are not meant to be called directly. In this case, __contains__ is invoked by using the in keyword.
>>> a = ['b']
>>> 'b' in a
True
>>> 'b' not in a
False
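If a callable is needed (which is roughly what a __notcontains__ would have provided), one option is to wrap not in inside a small function. This is only a sketch in plain Python; not_contains is a hypothetical helper name, not something built into the language:

```python
id2_array = ["a", "c"]


def not_contains(container, item):
    """Hypothetical helper: the negation of the membership test `item in container`."""
    return item not in container


values = ["a", "b", "c", "d"]
present = [v for v in values if v in id2_array]             # uses `in`
absent = [v for v in values if not_contains(id2_array, v)]  # uses `not in`
```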
Upvotes: 2
Reputation: 1178
No, there is no __notcontains__ method or similar. When using x not in y, the method __contains__ is actually used, as shown below:
class MyList(list):
    def __contains__(self, x):
        print("__contains__ is called")
        return super().__contains__(x)

l = MyList([1, 2, 3])
1 in l
# __contains__ is called
1 not in l
# __contains__ is called
Upvotes: 2