Reputation: 39
I want to write a function that returns the DataFrame 'data' filtered to include only the columns listed in good_columns.
def filter_by_columns(data,columns):
    data = data[[good_columns]] # this raises an error when I call the function on the next line:
filter_data = fileter_by_columns(data, good_columns)
Upvotes: 0
Views: 71
Reputation: 13821
Instead of creating a function, you can do the following. Assuming your main dataframe is called df, you can create a new one with only the columns you specify using the code below:
cols_to_keep = ['c1','c2','c3'] # just enter the column names you want to keep
data = df[cols_to_keep] # cols_to_keep is already a list, so use a single pair of brackets
Or, equivalently, change your function to this:
I created a random df:
   col1  col2  col3  col4  col5
0     1     5     1     5    10
1     2     4     2     4    20
2     3     3     3     3    30
3     4     2     4     2    40
4     5     1     5     1    50
Altered your function to this:
def filter_by_columns(data,good_columns):
    data = data[good_columns] # have only 1 set of brackets here
    return data
good_columns = ['col1','col2','col3'] # assign the columns you need
filter_data = filter_by_columns(df,good_columns)
and the new filter_data prints:
   col1  col2  col3
0     1     5     1
1     2     4     2
2     3     3     3
3     4     2     4
4     5     1     5
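For anyone who wants to run the whole thing end to end, here is a minimal self-contained sketch of the same idea. The sample values mirror the df above. The missing-column check and the .copy() call are my own additions rather than part of the original function, so treat them as optional assumptions.

```python
import pandas as pd


def filter_by_columns(data, good_columns):
    """Return a new DataFrame holding only the columns named in good_columns."""
    # Assumed extra check (not in the original function): fail early with a
    # readable message if a requested column does not exist.
    missing = [col for col in good_columns if col not in data.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    # good_columns is already a list, so one pair of brackets is enough.
    # .copy() is an optional choice so later edits do not touch the original.
    return data[good_columns].copy()


# Sample frame with the same values as the df shown above.
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5],
    "col2": [5, 4, 3, 2, 1],
    "col3": [1, 2, 3, 4, 5],
    "col4": [5, 4, 3, 2, 1],
    "col5": [10, 20, 30, 40, 50],
})

good_columns = ["col1", "col2", "col3"]
filter_data = filter_by_columns(df, good_columns)
print(filter_data)
```

Because the function reads from its data argument rather than from a global df, it should work the same way on any DataFrame you pass in.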
Upvotes: 1