Reputation: 1596
I have a set of values in a NumPy array. I want to find the row indices in a DataFrame column where each of those values first appears.
import numpy as np
import pandas as pd
data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 'year': [2012, 2012, 2013, 2014, 2014], 'reports': [4, 24, 31, 2, 3]}
df = pd.DataFrame(data)
mid = np.array([2012, 2013])
I want to find the row indices of the first appearances of the values 2012 and 2013 in the year column. The expected answer is
[0,2]
In fact, the row index of any single occurrence of each value is fine with me. That is, I would also accept the answer
[1,2]
Upvotes: 1
Views: 65
Reputation: 862681
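If the DataFrame has the default index (so the index labels are the same as the positions) and the values in the year column are sorted, use Series.searchsorted. Note that it returns insertion positions, so this only gives the right answer when every value in mid is actually present in the column: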
idx = df['year'].searchsorted(mid).tolist()
print(idx)
[0, 2]
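To illustrate the caveat about searchsorted, here is a minimal sketch that checks whether each returned position really holds the searched value. The value 2015 in mid and the names pos, in_range and found are assumptions added for illustration; they are not part of the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'year': [2012, 2012, 2013, 2014, 2014]})
# 2015 is a hypothetical value that does not occur in the column
mid = np.array([2012, 2013, 2015])

years = df['year'].to_numpy()           # must be sorted for searchsorted
pos = np.searchsorted(years, mid)       # leftmost insertion positions

# A position is a real match only if it is inside the array
# and the value stored there equals the value we searched for.
in_range = pos < len(years)
found = np.zeros(len(mid), dtype=bool)
found[in_range] = years[pos[in_range]] == mid[in_range]

idx = pos[found].tolist()
print(idx)  # [0, 2]
```

Without the check, the position returned for 2015 would be 5, which is past the end of the column.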
For a general solution, filter the rows with Series.isin in boolean indexing, keep only the first row for each year with DataFrame.drop_duplicates, and finally convert the index to a list:
idx = df[df['year'].isin(mid)].drop_duplicates('year').index.tolist()
print(idx)
[0, 2]
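The isin and drop_duplicates approach returns index labels, which match row positions only with the default index. If positions are needed for an arbitrary index or an unsorted column, one possible sketch uses np.unique with return_index=True. The helper name first_positions and the sample frame with a string index are hypothetical and only for illustration.

```python
import numpy as np
import pandas as pd

def first_positions(series, values):
    """Return the row position of the first occurrence of each value.

    Results follow the order of `values`; values that do not occur
    in `series` are skipped.
    """
    arr = series.to_numpy()
    mask = np.isin(arr, values)
    positions = np.flatnonzero(mask)    # row positions of matching rows
    # return_index gives the first occurrence of each distinct value
    uniq, first = np.unique(arr[mask], return_index=True)
    lookup = dict(zip(uniq.tolist(), positions[first].tolist()))
    return [lookup[v] for v in np.asarray(values).tolist() if v in lookup]

# Hypothetical frame: unsorted years and a non-default index
df = pd.DataFrame({'year': [2014, 2013, 2012, 2013, 2012]},
                  index=list('abcde'))
mid = np.array([2012, 2013])

result = first_positions(df['year'], mid)
print(result)  # [2, 1]
```

On the frame from the question, the same helper gives [0, 2], matching the output above.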
Upvotes: 1