Cagdas Kanar

Reputation: 763

Looping a training model over model classes and changing the dataframe name by model class

I am trying to automate a training pipeline and am having trouble renaming the input dataframes as the model class changes.

sample_train = [9]
sample_test = [18]
model_class = [0]


for i in sample_train:
    for j in model_class:
        # Define the training datasets -Filter the datasets with model selections
        trainX_M[j] = mldata_pd[(mldata_pd.sample_id == i) & (mldata_pd.training_set_band == j)].drop(
            ['conv_gv_band', 'sample_id', 'training_set_band'], axis=1)
        trainy_M[j] = mldata_pd[(mldata_pd.sample_id == i) & (
            mldata_pd.training_set_band == j)].iloc[:, mldata_pd.columns == 'conv_gv_band']

    trainX_M0, testX, trainy_M0, testy = train_test_split(trainX_M0, trainy_M0,
                                                          test_size=0.2,
                                                          random_state=42)

I expect to get trainX_M0 when model_class is 0, but I receive this error:

NameError: name 'trainX_M' is not defined

Upvotes: 0

Views: 434

Answers (2)

lidrariel

Reputation: 79

You are assigning to position j with trainX_M[j] = (and likewise with trainy_M[j] =), and the error tells you that trainX_M has not been defined beforehand. I also cannot see a definition in the code snippet you posted. Are you sure there is one and that it is spelled the same way?

If there is none, you could initialize the list trainX_M like this (I assume it should be the same size as model_class):

trainX_M = [None] * len(model_class)
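
To show where that initialization would sit relative to the loop, here is a minimal sketch. The small mldata_pd frame and its feature_a column are made up for illustration (the real data is not shown in the question), and trainy_M is initialized the same way on the assumption that it needs the same treatment.

```python
import pandas as pd

# Toy stand-in for mldata_pd; the values and feature_a are hypothetical.
mldata_pd = pd.DataFrame({
    "sample_id": [9, 9, 9, 18],
    "training_set_band": [0, 0, 1, 0],
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "conv_gv_band": [0, 1, 0, 1],
})

sample_train = [9]
model_class = [0]

# Define the containers before the loop so that trainX_M[j] has a slot to fill.
trainX_M = [None] * len(model_class)
trainy_M = [None] * len(model_class)

for i in sample_train:
    for j in model_class:
        # Rows belonging to this sample and this model class.
        mask = (mldata_pd.sample_id == i) & (mldata_pd.training_set_band == j)
        trainX_M[j] = mldata_pd[mask].drop(
            ["conv_gv_band", "sample_id", "training_set_band"], axis=1)
        trainy_M[j] = mldata_pd.loc[mask, ["conv_gv_band"]]

print(trainX_M[0])
```

Note that indexing the list with j only works while the class labels are exactly 0, 1, ..., len(model_class) - 1. With more than one entry in sample_train, each pass would also overwrite the previous one.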

Upvotes: 0

Omer Anisfeld

Reputation: 1302

Your variable is trainX_M, not trainX_M0. Change it to:

trainX_M0 = mldata_pd[(mldata_pd.sample_id == i) & (mldata_pd.training_set_band == j)].drop(['conv_gv_band', 'sample_id', 'training_set_band'], axis=1)

Alternatively, create a list trainX_M and append each class's dataframe to it.
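
A rough sketch of the list-and-append variant, assuming a toy mldata_pd with a made-up feature_a column and a single sample id. The dictionary at the end is an extra that is not part of the suggestion above; it is one possible way to look a frame up by class label instead of by a variable name such as trainX_M0.

```python
import pandas as pd

# Toy stand-in for mldata_pd; the values and feature_a are hypothetical.
mldata_pd = pd.DataFrame({
    "sample_id": [9, 9, 9, 9],
    "training_set_band": [0, 0, 2, 2],
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "conv_gv_band": [0, 1, 1, 0],
})

sample_id = 9
model_class = [0, 2]  # labels do not have to be consecutive when appending

trainX_M = []  # one feature frame per class, in model_class order
trainy_M = []  # matching targets, in the same order

for j in model_class:
    mask = (mldata_pd.sample_id == sample_id) & (mldata_pd.training_set_band == j)
    trainX_M.append(mldata_pd[mask].drop(
        ["conv_gv_band", "sample_id", "training_set_band"], axis=1))
    trainy_M.append(mldata_pd.loc[mask, "conv_gv_band"])

# Optional: map each class label to its frame for lookup by label.
trainX_by_class = dict(zip(model_class, trainX_M))
print(trainX_by_class[2])
```

Each entry can then be passed to train_test_split on its own, for example trainX_M[0] together with trainy_M[0].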

Upvotes: 1
