Kolibril

Reputation: 1443

Filtering DataFrame with a mean threshold

I have a DataFrame, and I want to keep only the columns whose mean is above a certain threshold.

My code looks like this:

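import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((20, 20)))
mean_keep = df.mean() > 0.5        # boolean Series, one entry per column
mean_keep = mean_keep[mean_keep]   # keep only the True entries
df_new = df[mean_keep.index]       # select those columns by label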

This works, but I wonder whether there is a function like "TAKE_ONLY_COLUMNS" that reduces it to one line, such as:

df_new = df[TAKE_ONLY_COLUMNS(df.mean() > 0.5)]

Upvotes: 1

Views: 43

Answers (1)

anky

Reputation: 75120

Use df.loc[] here:

df_new = df.loc[:, df.mean() > 0.5]
print(df_new)

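This keeps only the columns where the condition is True: df.mean() > 0.5 is a boolean Series indexed by column label, and df.loc selects the columns whose entry is True.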
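
To check that the one-liner matches the multi-step version from the question, here is a minimal sketch on a small hand-made frame. The column names "a", "b", "c" and their values are made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical data: column means are a=0.2, b=0.7, c=0.6
df = pd.DataFrame({
    "a": [0.1, 0.2, 0.3],
    "b": [0.6, 0.7, 0.8],
    "c": [0.9, 0.4, 0.5],
})

# One-liner: boolean mask over columns passed to .loc
mask = df.mean() > 0.5
df_new = df.loc[:, mask]

# Multi-step version from the question
mean_keep = mask[mask]
df_old = df[mean_keep.index]

# Both approaches select the same columns
print(list(df_new.columns))
print(df_new.equals(df_old))
```

Both selections should keep columns "b" and "c" and drop "a", since only "a" has a mean at or below 0.5.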

Upvotes: 1
