user18562240


How to find index of the first unique elements in Pandas DataFrame?

Consider

import pandas as pd
df1 = pd.DataFrame({"Name": ["Adam","Joseph","James","James","Kevin","Kevin","Kevin","Peter","Peter"]})

I want to get the index of the unique values in the dataframe.

When I call df1["Name"].unique(), I get

['Adam','Joseph','James','Kevin','Peter']

But I want the index of the first occurrence of each value:

[0,1,2,4,7]

Upvotes: 4

Views: 2360

Answers (4)

eshirvana

Reputation: 24568

The numpy answer is great, but here is one workaround:

# sort=False keeps the groups in order of first appearance instead of sorting by Name
out = df1.reset_index().groupby(['Name'], sort=False)['index'].min().to_list()

output:

[0, 1, 2, 4, 7]
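
One thing worth checking with this approach is the ordering: groupby sorts the group keys by default, so the result follows alphabetical order of the names unless sort=False is passed. A minimal sketch comparing the two, assuming the default RangeIndex from the question (the variable names by_key and by_appearance are only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Adam", "Joseph", "James", "James", "Kevin",
                             "Kevin", "Kevin", "Peter", "Peter"]})

# reset_index() exposes the row labels as a regular column named "index"
flat = df1.reset_index()

# Default behaviour: groups are ordered by sorted key (Adam, James, Joseph, ...)
by_key = flat.groupby("Name")["index"].min().to_list()

# sort=False: groups are ordered by first appearance in the data
by_appearance = flat.groupby("Name", sort=False)["index"].min().to_list()

print(by_key)         # [0, 2, 1, 4, 7]
print(by_appearance)  # [0, 1, 2, 4, 7]
```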

Upvotes: 3

Pavel

Reputation: 101

First match = first location

In[49]: import pandas as pd
   ...: df1 = pd.DataFrame({"Name":["Adam","Joseph","James","James","Kevin","Kevin","Kevin","Peter","Peter"]})
   ...: print([df1.loc[df1['Name'] == i].index[0] for i in df1['Name'].unique()])
   ...: 
[0, 1, 2, 4, 7]
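
The comprehension above scans the whole column once per unique value. As an alternative sketch (a different technique, not the one used in this answer), Series.duplicated marks every repeat of an earlier value, so negating it selects the first occurrences in a single pass. The names first_mask and first_idx are only illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Adam", "Joseph", "James", "James", "Kevin",
                             "Kevin", "Kevin", "Peter", "Peter"]})

# duplicated() is True for every repeat of an earlier value (keep="first" is the default),
# so its negation is True exactly at the first occurrence of each name
first_mask = ~df1["Name"].duplicated()

# Index labels of those rows, in order of appearance
first_idx = df1.index[first_mask].tolist()

print(first_idx)  # [0, 1, 2, 4, 7]
```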

Upvotes: 0

Abhishek

Reputation: 1625

Check the code below, which uses rank:

df1['rank'] = df1.groupby(['Name'])['Name'].rank(method='first')
df1[df1['rank'] == 1].index

Int64Index([0, 1, 2, 4, 7], dtype='int64')
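
The same "first row within each group" idea can also be sketched with groupby(...).cumcount(), which numbers the rows of each group from 0 and avoids adding a helper column. This is a substitute technique rather than the rank call above, and the variable names are only illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Adam", "Joseph", "James", "James", "Kevin",
                             "Kevin", "Kevin", "Peter", "Peter"]})

# cumcount() gives each row its 0-based position within its Name group
position_in_group = df1.groupby("Name").cumcount()

# Rows with position 0 are the first occurrence of each name
first_idx = df1.index[position_in_group == 0].tolist()

print(first_idx)  # [0, 1, 2, 4, 7]
```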

Upvotes: 0

user16836078


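I would suggest using numpy.unique with return_index=True. Note that it returns the unique values in sorted order, so the indices follow that order rather than the order of appearance.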

np.unique(df1, return_index=True)
Out[13]: 
(array(['Adam', 'James', 'Joseph', 'Kevin', 'Peter'], dtype=object),
 array([0, 2, 1, 4, 7], dtype=int64))
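
If the indices are needed in order of first appearance, as in the question, sorting the returned index array is enough. A minimal sketch, assuming a single column and the default RangeIndex (np.unique returns positions, which only coincide with index labels in that case):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"Name": ["Adam", "Joseph", "James", "James", "Kevin",
                             "Kevin", "Kevin", "Peter", "Peter"]})

# return_index=True also returns the position of the first occurrence of each unique value;
# both arrays are ordered by the sorted unique values
values, first_pos = np.unique(df1["Name"].to_numpy(), return_index=True)

# Sorting the positions restores the order in which the values first appear
positions = np.sort(first_pos).tolist()

print(positions)  # [0, 1, 2, 4, 7]
```

For a DataFrame with a non-default index, df1.index[np.sort(first_pos)] maps these positions back to labels.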

Upvotes: 2
