nilsinelabore

Reputation: 5095

How to do kfold cross-validation for multi-input models

The model is as below:

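# Imports assumed here: TensorFlow's bundled Keras and scikit-learn.
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers import Conv1D, Dense
from sklearn.model_selection import train_test_split

inputs_1 = keras.Input(shape=(10081,1))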

layer1 = Conv1D(64,14)(inputs_1)
layer2 = layers.MaxPool1D(5)(layer1)
layer3 = Conv1D(64, 14)(layer2)
layer4 = layers.GlobalMaxPooling1D()(layer3)

inputs_2 = keras.Input(shape=(85,))            
layer5 = layers.concatenate([layer4, inputs_2])
layer6 = Dense(128, activation='relu')(layer5)
layer7 = Dense(2, activation='softmax')(layer6)

model_2 = keras.models.Model(inputs = [inputs_1, inputs_2], outputs = [layer7])

# Hold out 20% of the rows for testing; the columns keep their order.
X_train, X_test, y_train, y_test = train_test_split(df.iloc[:,0:10166], df[['Result_cat','Result_cat1']].values, test_size=0.2)
X_train = X_train.to_numpy()
X_train = X_train.reshape([X_train.shape[0], X_train.shape[1], 1])
X_train_1 = X_train[:,0:10081,:]  # time-series columns for inputs_1
X_train_2 = X_train[:,10081:10166,:].reshape(X_train.shape[0], 85)  # remaining 85 columns for inputs_2

X_test = X_test.to_numpy()
X_test = X_test.reshape([X_test.shape[0], X_test.shape[1], 1])
X_test_1 = X_test[:,0:10081,:]
X_test_2 = X_test[:,10081:10166,:].reshape(X_test.shape[0], 85)

adam = keras.optimizers.Adam(learning_rate = 0.0005)  # older Keras releases spell this argument lr
model_2.compile(loss = 'categorical_crossentropy', optimizer = adam, metrics = ['acc'])
history = model_2.fit([X_train_1,X_train_2], y_train, epochs = 120, batch_size = 256, validation_split = 0.2, callbacks = [keras.callbacks.EarlyStopping(monitor='val_loss', patience=20)])

Questions:

1) The data is 921 rows x 10166 columns. Each row is an observation (the first 10080 columns are a time series and the remaining columns are other statistical features). Given the model above, is the input data split into inputs_1 and inputs_2 randomly?

2) I am thinking about doing k-fold cross-validation while still splitting the input data into inputs_1 and inputs_2. What is a good way to do this? Thanks

Upvotes: 1

Views: 868

Answers (1)

Jaafar Mahmoud

Reputation: 26

Split only the sample indices with KFold, then use the same indices to select rows from each input and from the output.

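import numpy as np
from sklearn.model_selection import KFold

# Input1, Input2 and Output are NumPy arrays that share the same first (sample) axis.
num_folds = 5
kfold = KFold(n_splits=num_folds, shuffle=False)
sample_ids = np.arange(nSamples)  # nSamples = number of rows, e.g. Input1.shape[0]

for IDs_Train, IDs_Test in kfold.split(sample_ids):
  # The same row indices are applied to both inputs and to the output.
  Fold_Train_Input1, Fold_Train_Input2 = Input1[IDs_Train], Input2[IDs_Train]
  Fold_Train_OutPut = Output[IDs_Train]

  Fold_Test_Input1, Fold_Test_Input2 = Input1[IDs_Test], Input2[IDs_Test]
  Fold_Test_OutPut = Output[IDs_Test]
  # Build, train and evaluate a fresh model on this fold here.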
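
As a fuller illustration of the index-splitting idea, here is a minimal sketch on synthetic data. It assumes the column split works as in the question's own code: the columns are assigned to the two inputs by fixed slicing, so only the row selection per fold varies. The names `iter_folds`, `X_series`, `X_stats` and the small array sizes are made up for the example, and `build_model` in the comments is a hypothetical function that would return a freshly compiled copy of the two-input model.

```python
import numpy as np
from sklearn.model_selection import KFold

# Synthetic stand-in for the real table (sizes shrunk for illustration).
n_samples, n_series, n_stats = 50, 20, 5
rng = np.random.default_rng(0)
X = rng.normal(size=(n_samples, n_series + n_stats))
y = np.eye(2)[rng.integers(0, 2, size=n_samples)]  # one-hot labels, 2 classes

# Column split is done by slicing, so it is fixed rather than random.
X_series = X[:, :n_series, np.newaxis]  # (n_samples, n_series, 1) for the Conv1D branch
X_stats = X[:, n_series:]               # (n_samples, n_stats) for the dense branch


def iter_folds(x1, x2, targets, n_splits=5, seed=0):
    """Yield (train, test) tuples where both inputs are indexed by the same rows."""
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    sample_ids = np.arange(len(targets))
    for train_ids, test_ids in kfold.split(sample_ids):
        train = (x1[train_ids], x2[train_ids], targets[train_ids])
        test = (x1[test_ids], x2[test_ids], targets[test_ids])
        yield train, test


fold_sizes = []
for (tr1, tr2, tr_y), (te1, te2, te_y) in iter_folds(X_series, X_stats, y):
    # With a real model, each fold would train a fresh instance, for example:
    #   model = build_model()                  # hypothetical helper
    #   model.fit([tr1, tr2], tr_y, ...)
    #   model.evaluate([te1, te2], te_y)
    fold_sizes.append((len(tr_y), len(te_y)))
```

Building a new model inside the loop matters because reusing one model across folds would carry weights trained on earlier folds into later ones. Averaging the per-fold evaluation scores then gives the cross-validated estimate.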

Upvotes: 1
