StackSuperLow

Reputation: 39

Pandas filtering by column list

I want to write a function that returns the DataFrame 'data' filtered to include only the columns listed in my list good_columns.

def filter_by_columns(data,columns):
   data = data[[good_columns]]  # this raises an error when I call the function on the line below:

filter_data = fileter_by_columns(data, good_columns)

Upvotes: 0

Views: 71

Answers (1)

sophocles

Reputation: 13821

Instead of creating a function, you can do the following:

Assuming your main dataframe is called df, you can create a new one with only the columns you specify using the code below:

cols_to_keep = ['c1','c2','c3'] # just enter the column names you want to keep
data = df[cols_to_keep]  # cols_to_keep is already a list, so use a single pair of brackets
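
On why the extra pair of brackets matters: when the variable already holds a list, wrapping it in another pair of brackets produces a list that contains a list, not a list of column names. A minimal plain-Python sketch of the difference (the names nested and flat are just for illustration):

```python
cols_to_keep = ['c1', 'c2', 'c3']

nested = [cols_to_keep]  # the key that df[[cols_to_keep]] would pass to pandas
flat = cols_to_keep      # the key that df[cols_to_keep] would pass to pandas

print(nested)                  # [['c1', 'c2', 'c3']]
print(len(nested), len(flat))  # 1 3
```

Double brackets are only needed when you type the names inline, as in df[['c1', 'c2', 'c3']], because there the inner brackets are what create the list.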

Or, equivalently, change your function to this:

I created a sample df to test with:

   col1  col2  col3  col4  col5
0     1     5     1     5    10
1     2     4     2     4    20
2     3     3     3     3    30
3     4     2     4     2    40
4     5     1     5     1    50

Altered your function to this:

def filter_by_columns(data, good_columns):
   data = data[good_columns]  # index the data argument, with only 1 set of brackets
   return data  # the original function never returned anything

good_columns = ['col1','col2','col3'] # assign the columns you need
filter_data = filter_by_columns(df,good_columns)

and the new filter_data prints:

   col1  col2  col3
0     1     5     1
1     2     4     2
2     3     3     3
3     4     2     4
4     5     1     5
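
If you want to run the whole thing end to end, here is a self-contained sketch that rebuilds the sample df above and applies the function. The check for missing columns and the .copy() call are my own optional additions, not something the question asked for; drop them if you don't need them.

```python
import pandas as pd

# Rebuild the sample frame shown above
df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5],
    'col2': [5, 4, 3, 2, 1],
    'col3': [1, 2, 3, 4, 5],
    'col4': [5, 4, 3, 2, 1],
    'col5': [10, 20, 30, 40, 50],
})

def filter_by_columns(data, columns):
    """Return a new DataFrame holding only the requested columns."""
    # Optional addition: fail early with a clear message on unknown names
    missing = [c for c in columns if c not in data.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    # Single brackets: `columns` is already a list of names.
    # .copy() (optional) makes the result independent of the original frame.
    return data[columns].copy()

good_columns = ['col1', 'col2', 'col3']
filter_data = filter_by_columns(df, good_columns)
print(filter_data)
```

Using the function's own argument (data) instead of a global df means it works on whichever frame you pass in.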

Upvotes: 1
