
Reputation: 21

GridSearchCV error: ValueError: Sequential model 'sequential' has no defined outputs yet

I am trying to fine-tune the hyperparameters of my deep learning neural network on a dataset that I have already done feature engineering on. I have kept only the relevant features and have scaled the data (using MinMaxScaler). I followed the steps I have seen online for finding the best parameters:

  1. Feature engineering and data scaling (pre-processing)
  2. Writing a build function for the neural network
  3. Creating a KerasRegressor object from that build function
  4. Creating a dictionary of the parameters I wish to test
  5. Creating a GridSearchCV object with the KerasRegressor object as the estimator and the parameters dictionary as param_grid
  6. Fitting on the training set (from train_test_split)
  7. Printing best_params_

However, I ran into an error:

Traceback (most recent call last):
  File "C:\Users\vishv\anaconda3\Lib\site-packages\joblib\externals\loky\", line 428, in _process_worker
    r = call_item()
  File "C:\Users\vishv\anaconda3\Lib\site-packages\joblib\externals\loky\", line 275, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "C:\Users\vishv\anaconda3\Lib\site-packages\joblib\", line 620, in __call__
    return self.func(*args, **kwargs)
  File "C:\Users\vishv\anaconda3\Lib\site-packages\joblib\", line 288, in __call__
    return [func(*args, **kwargs)
  File "C:\Users\vishv\anaconda3\Lib\site-packages\joblib\", line 288, in <listcomp>
    return [func(*args, **kwargs)
  File "C:\Users\vishv\anaconda3\Lib\site-packages\sklearn\utils\", line 127, in __call__
    return self.function(*args, **kwargs)
  File "C:\Users\vishv\anaconda3\Lib\site-packages\sklearn\model_selection\", line 732, in _fit_and_score, y_train, **fit_params)
  File "C:\Users\vishv\anaconda3\Lib\site-packages\scikeras\", line 760, in fit
  File "C:\Users\vishv\anaconda3\Lib\site-packages\scikeras\", line 926, in _fit
  File "C:\Users\vishv\anaconda3\Lib\site-packages\scikeras\", line 549, in _check_model_compatibility
    if self.n_outputs_expected_ != len(self.model_.outputs):
  File "C:\Users\vishv\anaconda3\Lib\site-packages\keras\src\models\", line 277, in outputs
    raise ValueError(
ValueError: Sequential model 'sequential' has no defined outputs yet.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Users\vishv\OneDrive\Documents\Projects and Personal Learning\Spotify Top 200 Chart Analysis\", line 100, in <module>
    grid =,y_train)
  File "C:\Users\vishv\anaconda3\Lib\site-packages\sklearn\", line 1151, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "C:\Users\vishv\anaconda3\Lib\site-packages\sklearn\model_selection\", line 898, in fit
  File "C:\Users\vishv\anaconda3\Lib\site-packages\sklearn\model_selection\", line 1419, in _run_search
  File "C:\Users\vishv\anaconda3\Lib\site-packages\sklearn\model_selection\", line 845, in evaluate_candidates
    out = parallel(
  File "C:\Users\vishv\anaconda3\Lib\site-packages\sklearn\utils\", line 65, in __call__
    return super().__call__(iterable_with_config)
  File "C:\Users\vishv\anaconda3\Lib\site-packages\joblib\", line 1098, in __call__
  File "C:\Users\vishv\anaconda3\Lib\site-packages\joblib\", line 975, in retrieve
  File "C:\Users\vishv\anaconda3\Lib\site-packages\joblib\", line 567, in wrap_future_result
    return future.result(timeout=timeout)
  File "C:\Users\vishv\anaconda3\Lib\concurrent\futures\", line 456, in result
    return self.__get_result()
  File "C:\Users\vishv\anaconda3\Lib\concurrent\futures\", line 401, in __get_result
    raise self._exception
ValueError: Sequential model 'sequential' has no defined outputs yet.

Below is my code. Note that I am fairly new to machine learning and neural nets:

# DataFrame Libraries
import pandas as pd
import numpy as np
import random as rnd

# Visualization Libraries
import matplotlib.pyplot as plt
from pandasgui import show
import seaborn as sns

# Machine Learning Libraries
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import r2_score
from sklearn.preprocessing import MinMaxScaler
import tensorflow as tf
from tensorflow.keras.models import Sequential
from scikeras.wrappers import KerasRegressor
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.metrics import R2Score
from tensorflow.keras.callbacks import EarlyStopping

# Read in Data
spotify_df = pd.read_csv('spotify_top_songs_audio_features.csv',index_col="id")

# Clean Data
    # Dropping source, mode, key, time_signature (no/little correlation to features)
spotify_df.drop(['source','mode', 'key', 'time_signature'],axis=1,inplace=True)

    # Mapping outlier in artist_names (Tyler, The Creator -> Tyler The Creator) 
def tyler_map(artist_names):
    # Remove the comma so the later split(", ") keeps this artist as one name
    if 'Tyler, The Creator' in artist_names:
        return artist_names.replace('Tyler, The Creator','Tyler The Creator')
    return artist_names

spotify_df['artist_names'] = spotify_df['artist_names'].apply(tyler_map)

    # Splitting artist names into lists of each artist + making dummies for each artist
spotify_df['artist_names'] = spotify_df['artist_names'].apply(lambda x:x.split(", "))

artist_dummy = pd.get_dummies(data=spotify_df['artist_names'].explode(),drop_first=True).groupby(level=0).sum()

    # Concat dummies to original list (without artist_names)
spotify_df = pd.concat([spotify_df.drop('artist_names',axis=1),artist_dummy],axis=1)

X = spotify_df.iloc[:,13:]
y = spotify_df['weeks_on_chart']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

scaler = MinMaxScaler()

X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

early_stop = EarlyStopping(monitor='val_loss', mode='min', verbose=0, patience=25)

def buildModel(optimizer='adam'):
    model = Sequential()

    model.add(Dense(234, activation = 'relu'))

    for i in range(2):
        model.add(Dense(78, activation = 'relu'))

        model.add(Dense(78, activation = 'relu'))

    for i in range(5):
        model.add(Dense(39, activation = 'relu'))

        model.add(Dense(39, activation = 'relu'))

    for i in range(3):
        model.add(Dense(13, activation = 'relu'))

        model.add(Dense(13, activation = 'relu'))

    model.add(Dense(1, activation = 'linear'))


    return model

nn = KerasRegressor(model=buildModel,epochs=600,callbacks=[early_stop])

parameters = {'batch_size':[30,40,50,60,70]}

grid = GridSearchCV(estimator=nn,param_grid=parameters,scoring='neg_mean_absolute_error',cv=3)

grid = grid.fit(X_train,y_train)


Upvotes: 2

Views: 1152

Answers (2)


Reputation: 111

Instead of using GridSearchCV, I suggest you use Keras Tuner.

Upvotes: 0

Adrien Riaux

Reputation: 533

I'd recommend using MLPRegressor from the scikit-learn API if you want to use GridSearchCV, as it will be more compatible. (And consider RandomizedSearchCV once you have a lot of hyperparameters to tune.)

Also take a look at Pipeline in scikit-learn.
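As a rough illustration of that suggestion, here is a minimal sketch that puts MinMaxScaler and MLPRegressor in a Pipeline and searches it with GridSearchCV. The synthetic data, the step names ("scaler", "mlp"), and the grid values are assumptions made for this example only; they are not taken from the question's dataset.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption: replace with your own X and y)
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=120)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

# Scaling lives inside the pipeline, so it is re-fit on each CV training fold
pipe = Pipeline([
    ("scaler", MinMaxScaler()),
    ("mlp", MLPRegressor(max_iter=300, random_state=42)),
])

# Pipeline parameters are addressed as <step name>__<parameter name>
param_grid = {
    "mlp__hidden_layer_sizes": [(16,), (32, 16)],
    "mlp__batch_size": [16, 32],
}

grid = GridSearchCV(
    pipe, param_grid=param_grid, scoring="neg_mean_absolute_error", cv=3
)
grid.fit(X_train, y_train)

print(grid.best_params_)
```

With so few iterations scikit-learn may emit convergence warnings; raising max_iter is the usual remedy.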

Alternatively, you can use a framework dedicated to hyperparameter tuning, such as Optuna, which has good support for TensorFlow.

Upvotes: 0
