user1965813

Reputation: 671

Pandas not finding elements in Columns

Pandas does not seem to find all elements in a column:

import pandas as pd

df = pd.DataFrame({"rid": ["125264429", "a"], "id": [1, 2]})
1 in df["id"]                # <- expect True, get True
"125264429" in df["rid"]     # <- expect True, get False
df[df["rid"] == "125264429"] # <- yields result

I am sure there is a perfectly reasonable explanation for this behaviour, but I can't seem to find it. The last two lines appear to contradict each other. Does it have something to do with the fact that the datatype of the "rid" column is object?

Upvotes: 3

Views: 1449

Answers (2)

Lutz

Reputation: 655

I am not sure what in does here, but it is definitely not what you want (e.g. asking for 2 in df["id"] returns False as well).

The problem is that you are using in on something that is not a list or a set. So you have two options:

df["rid"].isin(["125264429"]).any()

or

"125264429" in df["rid"].to_list()

(OK, there are probably about a million more, but these are the easy ones I can see.)
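To make the two options above concrete, here is a minimal sketch using the DataFrame from the question. The variable names (mask, found, found_eq, matches) are only illustrative. It assumes a reasonably recent pandas; isin returns a boolean Series, so .any() is needed to collapse it to a single answer.

```python
import pandas as pd

df = pd.DataFrame({"rid": ["125264429", "a"], "id": [1, 2]})

# isin() compares every value in the column against the given list
# and returns a boolean Series of the same length.
mask = df["rid"].isin(["125264429"])

# any() collapses the boolean Series into a single truth value.
found = bool(mask.any())

# An equality comparison works the same way for a single value.
found_eq = bool((df["rid"] == "125264429").any())

# The same mask can be reused to select the matching rows.
matches = df[mask]
```

For a single lookup either form is fine; isin is the more natural choice when checking against several candidate values at once.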

Upvotes: 2

jezrael

Reputation: 863166

If you use the in operator on a Series/column, it tests the index values, not the values of the column (see the docs):

print(1 in df["id"])              # <- expect True, get True
print("125264429" in df["rid"])     # <- expect True, get False 

is the same as:

print(1 in df["id"].index)              # <- expect True, get True
print("125264429" in df["rid"].index)     # <- expect True, get False

So if you convert the values to a NumPy array or a list, it works as expected:

print(1 in df["id"].values)              # <- expect True, get True
print("125264429" in df["rid"].values)     # <- expect True, get True

print(1 in df["id"].tolist())              # <- expect True, get True
print("125264429" in df["rid"].tolist())     # <- expect True, get True

Upvotes: 6
