AndreasInfo

Reputation: 1227

pandas: How to get the value_counts() above a threshold

How can I get only the value_counts() entries that are above a threshold? I tried

df[df[col].value_counts(dropna=False) > 3]

to get all counts greater than 3, but I get

IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

Any hint? Thanks

Upvotes: 4

Views: 8342

Answers (3)

Sarah Amundrud

Reputation: 26

Sticking with value_counts, here's a simple solution:

df[col].value_counts(dropna=False)[df[col].value_counts(dropna=False) > 3]
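
For illustration, here is a minimal sketch of the same idea that computes the counts once and then filters them. The sample data and the column name "fruit" are made up for the example and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "fruit" is an assumed column name.
df = pd.DataFrame({"fruit": ["a"] * 5 + ["b"] * 2 + ["c"] * 4 + [np.nan] * 4})
col = "fruit"

# Count every distinct value (NaN included) once, then keep counts above 3.
counts = df[col].value_counts(dropna=False)
above = counts[counts > 3]
print(above)
```

This works where the original attempt fails because the boolean mask `counts > 3` is indexed by the distinct values, so it aligns with `counts` but not with the rows of `df`.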

Upvotes: 0

BENY

Reputation: 323386

Use isin, chained with your original value_counts:

out = df[df[col].isin(df[col].value_counts(dropna=False).loc[lambda x: x > 3].index)].copy()

Alternatively, use groupby with filter:

out = df.groupby(col).filter(lambda x : len(x)>3)
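
One difference between the two approaches is worth sketching: value_counts(dropna=False) counts NaN, whereas groupby skips NaN keys by default, so the results can differ when the column has missing values. The data and the column names "code" and "value" below are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "code" and "value" are assumed column names.
df = pd.DataFrame({
    "code": [1.0] * 5 + [2.0] * 2 + [np.nan] * 4,
    "value": range(11),
})
col = "code"

# value_counts(dropna=False) counts NaN, and isin matches NaN against NaN,
# so the four NaN rows are kept along with the five rows where code == 1.0.
counts = df[col].value_counts(dropna=False)
keep = counts[counts > 3].index
out_isin = df[df[col].isin(keep)]

# groupby skips NaN keys by default, so the NaN rows never reach the filter.
out_filter = df.groupby(col).filter(lambda g: len(g) > 3)

print(len(out_isin), len(out_filter))
```

If the column has no missing values, both variants should select the same rows.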

Upvotes: 3

Quang Hoang

Reputation: 150815

Try:

df[df.groupby(col)[col].transform('size')>3]

Or with value_counts:

counts = df[col].value_counts(dropna=False)
valids = counts[counts > 3].index

df[df[col].isin(valids)]

Another approach with value_counts and map:

counts = df[col].value_counts(dropna=False)
df[df[col].map(counts)>3]
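
As a rough check that the transform and map variants build the same row-level mask, here is a small sketch on made-up data without missing values. The column names "city" and "n" are assumptions for the example.

```python
import pandas as pd

# Hypothetical sample data; "city" and "n" are assumed column names.
df = pd.DataFrame({
    "city": ["Oslo"] * 4 + ["Bergen"] * 2 + ["Tromso"] * 5,
    "n": range(11),
})
col = "city"

# Variant 1: broadcast each group's size back onto its rows.
size_per_row = df.groupby(col)[col].transform("size")

# Variant 2: look up each row's value in the value_counts result.
counts = df[col].value_counts(dropna=False)
count_per_row = df[col].map(counts)

# Both masks share df's index, so they can be used to select rows directly.
mask_transform = size_per_row > 3
mask_map = count_per_row > 3
out = df[mask_map]
print(out)
```

Both variants produce a Series with one entry per row of `df`, which is what boolean indexing needs and what the original attempt was missing.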

Upvotes: 8
