Reputation: 2443
My system is Python 3.6, with numpy 1.16.2, scipy 1.2.1, and matplotlib 3.0.3.
import pandas as pd
import numpy
df=pd.DataFrame({'col1':['a','b','c'],'col2':['d',numpy.NaN,'c'],'col3':['c','b','b']})
df = df.astype({"col2": 'category'})
print(df)
The output of the above script is:
  col1 col2 col3
0    a    d    c
1    b  NaN    b
2    c    c    b
I want to find the index of each non-null item in the series col2 whose category is not in ['a','b','c'].
In this case, d is not null and is not in ['a','b','c'], so the expected result is the index of d, which is 0.
My solution is below:
getindex=numpy.where(~df['col2'].isin(['a','b','c']) & df['col2'].notna())
# if getindex is not empty, print it
if not all(getindex):
    print(getindex)
The output of my solution script is:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Upvotes: 1
Views: 384
Reputation: 863226
Use:
getindex=df.index[(~df['col2'].isin(['a','b','c']) & df['col2'].notna())]
print (getindex)
Int64Index([0], dtype='int64')
To select the first value without raising an error when no value matches:
print (next(iter(getindex), 'no match'))
0
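To illustrate how the default argument of next() behaves when nothing matches, here is a minimal sketch. The second frame, df_none, is a hypothetical example that is not part of the question.

```python
import numpy
import pandas as pd

# Same col2 data as in the question
df = pd.DataFrame({'col2': ['d', numpy.nan, 'c']}).astype({'col2': 'category'})
mask = ~df['col2'].isin(['a', 'b', 'c']) & df['col2'].notna()
first = next(iter(df.index[mask]), 'no match')

# Hypothetical frame in which every non-null value is in the list
df_none = pd.DataFrame({'col2': ['a', numpy.nan, 'c']}).astype({'col2': 'category'})
mask_none = ~df_none['col2'].isin(['a', 'b', 'c']) & df_none['col2'].notna()
first_none = next(iter(df_none.index[mask_none]), 'no match')

print(first)       # index label of the first match
print(first_none)  # the fallback value, since the index is empty
```

The fallback can be any object, so None or -1 would work equally well if a string is inconvenient.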
If you want to use an if statement, test for emptiness with Index.empty:
if not getindex.empty:
    print (getindex)
Your solution should work if you add [0] to select the first array from the tuple that numpy.where returns:
getindex=numpy.where(~df['col2'].isin(['a','b','c']) & df['col2'].notna())[0]
print (getindex)
[0]
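For context on the error in the question, here is a minimal sketch that uses only numpy. When numpy.where is called with a single condition it returns a tuple with one index array per dimension, so all(getindex) asks for the truth value of a whole array, which is ambiguous once that array holds more than one element. The mask values are made up for illustration.

```python
import numpy

# Made-up boolean mask with two matches (not the data from the question)
mask = numpy.array([True, False, True])

result = numpy.where(mask)   # a tuple containing one index array
positions = result[0]        # the index array itself

try:
    all(result)              # calls bool() on an array with two elements
    raised = False
except ValueError:
    raised = True

print(type(result).__name__, positions, raised)
```

Note that numpy.where yields positions, while df.index[mask] yields index labels. The two agree only for a default integer index like the one in the question.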
Upvotes: 2
Reputation: 323326
Modify your if condition:
getindex=numpy.where(~df['col2'].isin(['a','b','c']) & df['col2'].notna())
if any(~df['col2'].isin(['a','b','c']) & df['col2'].notna()): # changed to any on the boolean mask
    print(getindex)
(array([0], dtype=int64),)
Also, based on your comment "#if getindex is not empty, print it":
if len(getindex[0])!=0: # numpy.where returns a tuple, so test the length of the array inside it
    print(getindex)
(array([0], dtype=int64),)
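As a possible alternative for the emptiness test, here is a sketch that checks the size attribute of the array inside the tuple. It assumes a one-dimensional mask, as in the question; the variable name positions is introduced here only for illustration.

```python
import numpy
import pandas as pd

# Same col2 data as in the question
df = pd.DataFrame({'col2': ['d', numpy.nan, 'c']}).astype({'col2': 'category'})
mask = ~df['col2'].isin(['a', 'b', 'c']) & df['col2'].notna()

getindex = numpy.where(mask)   # tuple holding one array of positions
positions = getindex[0]

# For a 1-D mask the tuple always has length 1, so the tuple's length says
# nothing about matches; the array's size is the number of matches.
if positions.size > 0:
    print(positions)
```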
Upvotes: 1