Keenan Burke-Pitts

Reputation: 475

Subset Dataframe by Filtered Column

I'm wondering what the most efficient way is to update a dataframe I'm working with.
The 'location' column has some locations that I'd like to filter out. I'd like to keep only the locations that have more than two items (more than two rows). [Images: head and tail of df; snapshot of unique locations]

Upvotes: 1

Views: 46

Answers (1)

NoahL

Reputation: 158

This might be a little convoluted, but it should get the job done.

Get a list of all the locations with more than 2 occurrences:

counts = df['location'].value_counts()
filt = counts[counts > 2]

Build a boolean mask that is True for rows whose location is one of those kept above (keys() returns the location labels of the filtered counts):

filt2 = df['location'].isin(filt.keys())

Apply the filter:

print(df[filt2])
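
For reference, here is a minimal, self-contained sketch of the same steps on a small made-up dataframe. Only the 'location' column name comes from the question; the sample values, the 'item' column, and the names keep, subset and subset_alt are assumptions for illustration. It also shows one possible alternative using groupby with transform, which should give the same rows in a single expression.

```python
import pandas as pd

# Hypothetical sample data: only the 'location' column name is from the question.
df = pd.DataFrame({
    "location": ["NYC", "NYC", "NYC", "LA", "LA", "SF", "SF", "SF", "SF"],
    "item": list(range(9)),
})

# Count rows per location, then keep the labels that occur more than twice.
counts = df["location"].value_counts()
keep = counts[counts > 2].index

# Boolean mask over the original rows, then apply it.
subset = df[df["location"].isin(keep)]

# Alternative sketch: attach each row's group size and filter on it directly.
subset_alt = df[df.groupby("location")["location"].transform("size") > 2]

print(subset)
```

With this sample data, 'LA' has only two rows and is dropped, while 'NYC' and 'SF' are kept. Both approaches preserve the original row order and index.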

Upvotes: 2
