satyam sangeet

Reputation: 95

MLP Training on Multiple data

This question is a follow up to my previous question which can be found [here][1].

I have two datasets, each with three input features and one output (binary classification). I used the approach described [here][2] to build a generator that feeds the training datasets to my model one at a time. But when I run the training loop, I get a ValueError for the dataset I pass in.

file1 = "/home/Documents/t1.csv"
file2 = "/home/Documents/t2.csv"
files = [file1,file2]

model = Sequential()
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

def BatchGenerator(files):
    for file in files:
        df = pd.read_csv(file)
        X_train = df.drop(["output"],axis=1)
        y_train = df["output"]
        yield (X_train, y_train)

n_epochs = 100
for epoch in range(n_epochs):
    for (X_train, y_train) in BatchGenerator(files):
        model.fit(X_train, y_train, batch_size = 32, nb_epoch = 1)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

I get the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-63-0b14a52c7fba> in <module>
 37 for epoch in range(n_epochs):
 38     for (X_train, y_train) in BatchGenerator(files):
---> 39         model.fit(X_train, y_train, batch_size = 32, nb_epoch = 1)
 40 model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

~/anaconda3/lib/python3.7/site-packages/keras/engine/training.py in fit(self, x, y, 
batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, 
class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, 
**kwargs)
948             sample_weight=sample_weight,
949             class_weight=class_weight,
--> 950             batch_size=batch_size)
951         # Prepare validation data.
952         do_validation = False

~/anaconda3/lib/python3.7/site-packages/keras/engine/training.py in 
_standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, 
batch_size)
662                                      'either a single '
663                                      'array or a list of arrays. '
--> 664                                      'You passed: x=' + str(x))
665                 all_inputs.append(x)
666 

ValueError: Please provide as model inputs either a single array or a list of arrays. 
You passed: x=         in1     in2  in3
0     2.1282  5.8809  0.0
1     2.9293  1.1067  0.0
2     2.5568  0.8797  0.0
3     2.9293  1.1067  0.0
4     0.0000  0.7009  0.0
...      ...     ...  ...
1268  1.2085  0.9672  0.0
1269  0.0000  0.7009  0.0
1270  3.4218  3.6143  0.0
1271  1.9270  0.8991  0.0
1272  2.1109  0.8390  0.0

[1273 rows x 3 columns]

Any help would be much appreciated. Thanks.

[1]: Training MLP on multiple csv files
[2]: Training a Neural Network with Multiple Datasets (Keras)

Upvotes: 0

Views: 254

Answers (1)

Szatan

Reputation: 11

You need to add an input layer to the Sequential model so that it knows the shape of the input vector to expect.

...
model = Sequential()
model.add(layers.Input(shape=(number_of_features_to_network,)))
...

The second problem is that you call model.compile after model.fit. Call the compile method first, then the fit method.
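Putting both points together, a minimal sketch might look like the one below. It assumes the tensorflow.keras API (with standalone Keras the imports would come from keras instead), and it assumes three features named in1, in2, in3 as in the printed DataFrame. The make_fake_frame helper is hypothetical and only stands in for pd.read_csv(file) so that the example runs without the CSV files. Converting each DataFrame to a NumPy array and using epochs instead of nb_epoch are extra precautions, not something the error message strictly requires.

```python
import numpy as np
import pandas as pd
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout, Input

N_FEATURES = 3  # assumption: in1, in2, in3, as in the question's data


def build_model(n_features):
    """Build and compile the MLP; the Input layer fixes the expected feature count."""
    model = Sequential()
    model.add(Input(shape=(n_features,)))
    model.add(Dense(64, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile before the first call to fit.
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model


def batch_generator(frames):
    """Yield (X, y) as NumPy arrays, one dataset at a time."""
    for df in frames:
        X = df.drop(columns=["output"]).to_numpy(dtype="float32")
        y = df["output"].to_numpy(dtype="float32")
        yield X, y


def make_fake_frame(n_rows, seed):
    """Hypothetical stand-in for pd.read_csv(file): random data, same columns."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(rng.random((n_rows, N_FEATURES)), columns=["in1", "in2", "in3"])
    df["output"] = rng.integers(0, 2, size=n_rows)
    return df


frames = [make_fake_frame(64, seed=0), make_fake_frame(64, seed=1)]
model = build_model(N_FEATURES)

n_epochs = 2  # kept small for the sketch; the question uses 100
for epoch in range(n_epochs):
    for X_train, y_train in batch_generator(frames):
        model.fit(X_train, y_train, batch_size=32, epochs=1, verbose=0)

# Sanity check: one sigmoid probability per input row.
X_check = frames[0].drop(columns=["output"]).to_numpy(dtype="float32")
predictions = model.predict(X_check, verbose=0)
```

With real data, frames would be replaced by DataFrames read from the CSV files in the question, and the rest of the loop stays the same.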

Upvotes: 1
