user14571893
user14571893

Reputation:

Select rows in a pandas DataFrame that match the first two different items of one column

I would like to display all rows in the DataFrame whose values under the column 'Nameid' correspond to the two first different values found in that column.

In the example below, the two first different values under the column named 'Nameid' are 1 and 2. I want to select all rows for which 'Nameid' equals either 1 or 2, and discard the rest. How do I do that?

What I have:

import pandas as pd

df = pd.DataFrame(data={
    'Nameid': [1, 2, 3, 1],
    'Name': ['Michael', 'Max', 'Susan', 'Michael'],
    'Project': ['S455', 'G874', 'B7445', 'Z874'],
})
display(df.head(10))

enter image description here

What I want:

enter image description here

Upvotes: 1

Views: 179

Answers (1)

jezrael
jezrael

Reputation: 863301

First sorting by column Nameid by DataFrame.sort_values:

df = df.sort_values('Nameid')

then use Series.isin with 2 first unique values by Series.unique:

df1 = df[df['Nameid'].isin(df['Nameid'].unique()[:2])].reset_index(drop=True)
print (df1)
   Nameid     Name Project
0       1  Michael    S455
1       1  Michael    Z874
2       2      Max    G874

Alternative with Series.drop_duplicates:

df1 = df[df['Nameid'].isin(df['Nameid'].drop_duplicates()[:2])].reset_index(drop=True)

EDIT: If want filter by equal or less like 2, thank you @DarrylG:

df2 = df[df['Nameid'] <= 2].reset_index(drop=True)

Upvotes: 1

Related Questions