Reputation: 495
I have a pandas DataFrame and want to select the rows where a certain column holds one of its 2 largest values. For the example below, selecting on 'production' should return the rows where 'duration' is 50 and 45.
I tried:
import pandas as pd

data = {
    "production": [420, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data)
df[df['production'] == df['production'].nlargest(2)]
but the last line raises:
ValueError: Can only compare identically-labeled Series objects
Upvotes: 2
Views: 52
Reputation: 14949
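The == comparison fails because nlargest(2) returns a 2-row Series whose index does not match the 3-row column. Use isin to test membership in the two largest values instead: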
result = df[df['production'].isin(df['production'].nlargest(2))]
Or, if you want every row whose value lies between these 2 values (inclusive):
result = df[df['production'].between(*df['production'].nlargest(2).values[::-1])]
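For reference, here is a minimal self-contained sketch of both approaches on the question's data. The variable names (top2, by_isin, by_between, by_nlargest) are just illustrative, and the DataFrame.nlargest line is an extra alternative that was not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "production": [420, 380, 390],
    "duration": [50, 40, 45],
})

# The two largest production values; this Series keeps the original row labels
top2 = df["production"].nlargest(2)

# Membership test: keep rows whose production equals one of the top-2 values
by_isin = df[df["production"].isin(top2)]

# Range test: keep rows whose production lies between the 2nd-largest
# and the largest value (between is inclusive on both ends by default)
by_between = df[df["production"].between(top2.min(), top2.max())]

# Alternative (assumption, not from the answer): DataFrame.nlargest returns
# the top rows directly, sorted by production in descending order
by_nlargest = df.nlargest(2, "production")

print(by_isin)
```

On this data all three give the rows with duration 50 and 45. They can differ when there are ties: isin and between keep every tied row, while DataFrame.nlargest returns exactly 2 rows unless you pass keep="all".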
Upvotes: 2