Bo Sanders

Reputation: 13

How do I get a list of records containing only True values in pandas?

I have a pandas DataFrame containing customer events with various columns. Some events appear more than once, and I want to put all of those events in a list. I tried this:

dup_events = [df_in['EVENTS'].value_counts() > 1]

This put every event in a list and attached True/False to each one, depending on whether it appears more than once.

How do I remove the False ones from the list?

Upvotes: 1

Views: 832

Answers (1)

Andreas

Reputation: 9197

You can do this:

df_in[df_in['EVENTS'].duplicated()]['EVENTS'].tolist()

Explained:

# Returns a boolean Series, called a mask: True for every occurrence after the first
mask = df_in['EVENTS'].duplicated()

# Filter the DataFrame with the boolean Series, keeping only the rows where it is True
df_in[mask]

# Get column you are interested in
df_in[mask]['EVENTS']

# Return list of the values in it
df_in[mask]['EVENTS'].tolist()
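
To see what these steps produce, here is a minimal sketch on made-up data. Only the column name 'EVENTS' comes from the question; the sample values and the 'CUSTOMER' column are assumptions for illustration. Because duplicated() flags each repeat separately, an event that occurs three times lands in the list twice, so the sketch also shows one way to collapse the result to distinct event names.

```python
import pandas as pd

# Hypothetical sample data; only the 'EVENTS' column name is from the question.
df_in = pd.DataFrame({
    "EVENTS": ["login", "click", "login", "purchase", "click", "login"],
    "CUSTOMER": [1, 1, 2, 2, 3, 3],
})

# duplicated() marks every occurrence after the first as True.
mask = df_in["EVENTS"].duplicated()

# One entry per repeated row: 'login' occurs three times, so it appears twice.
repeated_rows = df_in.loc[mask, "EVENTS"].tolist()
print(repeated_rows)  # ['login', 'click', 'login']

# Distinct names of events that occur more than once.
dup_events = df_in.loc[mask, "EVENTS"].unique().tolist()
print(dup_events)  # ['login', 'click']
```

Using .loc[mask, 'EVENTS'] selects the rows and the column in one step, which gives the same values as df_in[mask]['EVENTS'].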

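If you need a different threshold rather than just duplicates (for example, events that appear more than three times), count the rows per event and compare against that number. Note that this keeps every row of a matching event, including its first occurrence: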

df_in[df_in.groupby(['EVENTS'])['EVENTS'].transform('count')>1]['EVENTS'].tolist()
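
As a rough sketch of the threshold idea, the example below uses made-up data and a hypothetical `min_count` variable. It also shows how the value_counts() attempt from the question can be finished: index the counts with the boolean Series instead of wrapping that Series in a list.

```python
import pandas as pd

# Hypothetical sample data; only the 'EVENTS' column name is from the question.
df_in = pd.DataFrame({"EVENTS": ["a", "b", "a", "c", "b", "a", "d"]})

min_count = 2  # assumed threshold: keep events seen at least this many times

# value_counts() gives one count per distinct event; filter it by its own mask
# and read the surviving event names from the index.
counts = df_in["EVENTS"].value_counts()
frequent_events = counts[counts >= min_count].index.tolist()
print(sorted(frequent_events))  # ['a', 'b']

# transform('count') broadcasts each event's count back onto every row,
# so the mask lines up with df_in and keeps all rows of frequent events.
row_mask = df_in.groupby("EVENTS")["EVENTS"].transform("count") >= min_count
frequent_rows = df_in[row_mask]
print(frequent_rows["EVENTS"].tolist())  # ['a', 'b', 'a', 'b', 'a']
```

The value_counts() route returns each event name once, while the transform route returns the matching rows, so the choice depends on whether you need the names or the records.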

Upvotes: 1
