Reputation: 475
I'm wondering what the most efficient way is to filter a dataframe I'm working with.
The 'location' column has some locations that I'd like to filter out. I'd like to keep only the locations that appear in more than two rows.
Upvotes: 1
Views: 46
Reputation: 158
This might be a little convoluted, but it should get the job done.
Get a list of all the locations with more than 2 occurrences:
counts = df['location'].value_counts()
filt = counts[counts > 2]
Build a filter on the original data that picks out only the locations (filt.keys()) that occur more than 2 times:
filt2 = df['location'].isin(filt.keys())
Apply the filter to the dataframe:
print(df[filt2])
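For reference, here is a self-contained sketch of the steps above, next to a one-step variant using groupby/transform. The sample data and the 'value' column are made up for illustration; only the 'location' column name comes from the question. I haven't timed either version, so treat the second one as an alternative to try, not as a guaranteed speedup.

```python
import pandas as pd

# Hypothetical sample data; only the 'location' column name is from the question.
df = pd.DataFrame({
    "location": ["A", "A", "A", "B", "B", "C", "C", "C", "C"],
    "value": range(9),
})

# Steps from the answer: count rows per location, keep the locations
# with more than 2 rows, then select the matching rows.
counts = df["location"].value_counts()
filt = counts[counts > 2]
kept = df[df["location"].isin(filt.keys())]

# Alternative sketch: broadcast each location's row count back onto its
# rows with transform("size"), then compare against the threshold.
group_sizes = df.groupby("location")["location"].transform("size")
kept_alt = df[group_sizes > 2]

print(kept)
```

With this sample, 'B' has only two rows, so it is dropped, while 'A' and 'C' are kept. Both versions select the same rows here.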
Upvotes: 2