Reputation: 1998
I have df below:
id | status | id_reference
1 True NaN
4 False 1
2 False Nan
7 False 3
6 True 2
10 True 4
How can I only filter or want the rows in the df, where:
the status column is True and the id_reference column value exists in another row in the df as a value in the id column as a value.
So based on above df,
I would want rows below:
id | status | id_reference
6 True 2
10 True 4
actual df is much larger as this is a sample
Upvotes: 2
Views: 135
Reputation: 22503
Cast id_reference
to float
if you haven't already and use isin
:
print (df.loc[df["status"]&df["id_reference"].astype(float).isin(df["id"])])
id status id_reference
4 6 True 2
5 10 True 4
Upvotes: 3
Reputation: 46
def filterDataSet(self,dataSetValues):
try :
print(dataSetValues) dataSetValues= dataSetValues.fillna(0)
dataSetValues = dataSetValues[dataSetValues.id_reference>0]
dataSetValues = dataSetValues[dataSetValues.status == True]
return dataSetValues
except ValueError as ve:
print("Exception while filter data ", ve)
Upvotes: 0
Reputation: 3663
You can try this:
df = df.query('status == True & id_reference.notnull()')
print(df)
Upvotes: 0
Reputation: 354
Here is what you can try:
df.loc[df["status"] & df["id_ref"].isin(df["id"])]
Upvotes: 2