Reputation: 84455
I am currently looking at scikit-learn's preprocessing functions.
I wanted to know if I can loop over a predefined list of preprocessing functions so that I don't have to write out the setup code in full for each one.
E.g. code for one function:
T = preprocessing.MinMaxScaler()
X_train = T.fit_transform(X_train)
X_test = T.transform(X_test)
My attempt to loop over a pre-defined list so as to use different pre-processing functions:
pre_proc = ['Normalizer','MaxAbsScaler','MinMaxScaler','KernelCenterer', 'StandardScaler']
for proc in pre_proc:
    T = 'preprocessing.'+ proc +'()'
    X_train = T.fit_transform(X_train)
    X_test = T.transform(X_test)
Currently this yields the following error, which is not surprising:
--> 37 X_train = T.fit_transform(X_train)
38 X_test = T.transform(X_test)
39 for i in np.arange(startpt_c,endpt_c, step_c):
AttributeError: 'str' object has no attribute 'fit_transform'
I think I need to turn the string into the correct object type before I can call the method on it, i.e. have it recognised as a function.
Is there a way I can do this that satisfies my objective of using a loop?
Setup: Windows 8, 64-bit machine running Python 3 via Jupyter notebook in Azure ML Studio.
Upvotes: 0
Views: 923
Reputation: 7957
The problem lies in this line of your code:
pre_proc = ['Normalizer','MaxAbsScaler','MinMaxScaler','KernelCenterer', ...
Here you create pre_proc as a plain list of strings. Python has no way of knowing that you meant them to be classes. So T = 'preprocessing.'+ proc +'()' just builds another string, and when you call T.fit_transform Python raises an AttributeError saying that a 'str' object has no attribute 'fit_transform'. Instead of strings, put the actual classes in the list, i.e., don't put them in quotes, and create an instance of each one inside the loop with T = proc(). Define the list like so:
pre_proc = [preprocessing.Normalizer, preprocessing.MaxAbsScaler, preprocessing.MinMaxScaler, preprocessing.KernelCenterer, preprocessing.StandardScaler]
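To make the loop concrete, here is a minimal sketch. It assumes toy random data (not from the question) and stores each transformer's output in a hypothetical results dict, so that one scaler's output is not fed into the next one. KernelCenterer is left out of this sketch because it expects a square kernel matrix rather than a raw feature matrix. The last line shows getattr as one way to keep using string names if you prefer them.

```python
import numpy as np
from sklearn import preprocessing

# Toy data, purely illustrative (an assumption, not from the question).
rng = np.random.RandomState(0)
X_train = rng.rand(20, 3)
X_test = rng.rand(5, 3)

# A list of classes, not strings. KernelCenterer is omitted here because
# it is meant for kernel matrices, not plain feature matrices.
pre_proc = [
    preprocessing.Normalizer,
    preprocessing.MaxAbsScaler,
    preprocessing.MinMaxScaler,
    preprocessing.StandardScaler,
]

# 'results' is a hypothetical name: it keeps each output separate instead
# of overwriting X_train / X_test on every pass through the loop.
results = {}
for proc in pre_proc:
    T = proc()  # calling the class creates the transformer instance
    X_train_t = T.fit_transform(X_train)  # fit on the training data only
    X_test_t = T.transform(X_test)        # reuse the fitted parameters
    results[proc.__name__] = (X_train_t, X_test_t)

# If the names really have to stay strings, getattr can look up the class.
T_from_name = getattr(preprocessing, "MinMaxScaler")()
```

Fitting on X_train and only transforming X_test mirrors the single-function code in the question.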
Upvotes: 2