solocazzimiei

Reputation: 13

Trouble with parallel jobs in Optuna with Darts

I'm trying to set up Optuna for hyperparameter optimization. I have two main problems, and I don't know whether they are related. When I launch the script, with 20 or 100 trials alike, it runs, but in some versions of the script only the last trial is actually evaluated (see traceback B at the bottom), while in other versions the objective function returns an invalid value (see traceback A).
The toy sample code:

  1. Inside Optuna's objective, load a df:

    timestamp, A, B, C, ..., Z
    t-n,       a, b, c, ..., z
    ...
    t-1,       a, b, c, ...
    t,         a, b, c, ...

Now I take column B as the target; [B..Z] are the past covariates. To enhance the forecast, I create a new df_fut_covariate by copying only the timestamp index from df and extending it into the future up to t+5, then I use that index to extract the numeric values of year, month, day, ... and add them as columns.

Now I have 3 dfs: the target, the past covariates and the future covariates. Next comes train and validation: I split both the target and the past dfs to obtain target_train, target_val and past_cov_train, past_cov_val. The future covariates remain the same. I converted all 5 dfs into time series tensors.

Model setup:

common = {
    "input_chunk_length" :  model_params['input_chunk_length'],
    "output_chunk_length" : conf['param']['n_future'],
    "n_epochs" :            conf['param']['epoc'],
    "optimizer_cls" :       torch.optim.AdamW,
    "optimizer_kwargs" :    {'lr': model_params['learning_rate'] },
    "pl_trainer_kwargs" :   {"callbacks": [early_stopping, checkpoint], "enable_checkpointing": True},
    "force_reset" :         True,
    "dropout" :             model_params['dropout'],
    "batch_size" :          model_params['batch_size'],
    "random_state" :        42,
}
model = TSMixerModel(
    **common,
    hidden_dim=model_params['hidden_dim'],
    num_blocks=model_params['num_blocks'],
)

and finally

model.fit(
    series =                    train_target_list,
    past_covariates =           train_past_cov_list,
    val_series =                val_target_list,
    val_past_covariates =       val_past_cov_list,
    max_samples_per_ts =        d_dataset
)

At the end of the objective:

pred_cov = model.predict(
                n =                     conf['param']['n_future'],
                series=                 val_target_list,
                past_covariates=        val_past_cov_list,
                future_covariates=      time_fut_cov_list,
                max_samples_per_ts =    d_dataset ,
            )
        
        mse_v = np.mean(mse(target_chk_list,pred_cov))
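
The future-covariate step described above (copying the timestamp index, extending it to t+5 and deriving year, month, day) could be sketched roughly as follows. This is a plain standard-library illustration, not the actual script: `extend_index`, `calendar_features` and the sample dates are hypothetical names and values, and it assumes evenly spaced timestamps.

```python
from datetime import datetime, timedelta


def extend_index(timestamps, n_future):
    """Return the timestamps plus n_future extra steps.

    Assumption: the series is evenly spaced, so the spacing of the
    last two timestamps is reused for the future steps.
    """
    step = timestamps[-1] - timestamps[-2]
    future = [timestamps[-1] + step * (i + 1) for i in range(n_future)]
    return timestamps + future


def calendar_features(timestamps):
    """Build one row of calendar covariates per timestamp."""
    return [
        {"timestamp": ts, "year": ts.year, "month": ts.month, "day": ts.day}
        for ts in timestamps
    ]


# Hypothetical daily history: 29, 30 and 31 January 2025.
history = [datetime(2025, 1, 29) + timedelta(days=i) for i in range(3)]

full_index = extend_index(history, n_future=5)  # history plus t+1 .. t+5
fut_cov = calendar_features(full_index)
```

The resulting rows cover both the historical range and the 5 future steps, which is what a future covariate needs in order to be known at prediction time.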
  2. Optuna setup and execution

    study = optuna.create_study(direction="minimize")
    study.optimize(partial_objective, n_trials, n_jobs = n_jobs)
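
A side note on the two logs that follow: the `inf` values in traceback A and the `failed with value None` lines in traceback B are consistent with two script versions that differ only in whether the objective catches exceptions. Below is a minimal sketch of the catching variant. `safe_objective` is a hypothetical wrapper (not part of Optuna), and the two toy objectives are stand-ins for the real one.

```python
import math


def safe_objective(objective):
    """Wrap an objective so that any exception becomes an `inf` score."""
    def wrapped(trial):
        try:
            return objective(trial)
        except Exception as exc:  # broad on purpose, as the log in traceback A suggests
            print(f"Error while running the trial: {exc}")
            return math.inf
    return wrapped


def failing_objective(trial):
    # Stand-in for a trial whose model construction raises.
    raise AttributeError("'TiDEModel' object has no attribute '_model_call'")


def working_objective(trial):
    # Stand-in for a trial that trains and returns a validation score.
    return 0.5


failed_value = safe_objective(failing_objective)(trial=None)
ok_value = safe_objective(working_objective)(trial=None)
```

Without such a wrapper the exception propagates and Optuna marks the trial as failed, which matches the warnings in traceback B. In both cases the underlying `_model_call` error is the same.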

A.TRACEBACK:

[I 2025-02-04 16:27:16,712] A new study created in memory with name: no-name-40d2d4fb-84a3-46bf-adc3-2ce951a31e04
Error while running the trial: type object 'TiDEModel' has no attribute '_model_call'
Error while running the trial: 'TiDEModel' object has no attribute '_model_call'
Error while running the trial: 'TiDEModel' object has no attribute '_model_call'
[I 2025-02-04 16:27:16,782] Trial 3 finished with value: inf and parameters: {'hidden_size': 368, 'num_encoder_layers': 24, 'num_decoder_layers': 14, 'input_chunk_length': 340, 'learning_rate': 0.000154431866048337, 'batch_size': 145, 'dropout': 0.36806900421609556}. Best is trial 3 with value: inf.
Error while running the trial: 'TiDEModel' object has no attribute '_model_call'
[I 2025-02-04 16:27:16,784] Trial 1 finished with value: inf and parameters: {'hidden_size': 258, 'num_encoder_layers': 7, 'num_decoder_layers': 24, 'input_chunk_length': 235, 'learning_rate': 0.0001880579274638445, 'batch_size': 281, 'dropout': 0.09468542531263935}. Best is trial 3 with value: inf.
Error while running the trial: 'TiDEModel' object has no attribute '_model_call'
[I 2025-02-04 16:27:16,785] Trial 2 finished with value: inf and parameters: {'hidden_size': 450, 'num_encoder_layers': 18, 'num_decoder_layers': 15, 'input_chunk_length': 222, 'learning_rate': 2.4736475027811698e-05, 'batch_size': 107, 'dropout': 0.03371359556948311}. Best is trial 3 with value: inf.
[I 2025-02-04 16:27:16,787] Trial 4 finished with value: inf and parameters: {'hidden_size': 528, 'num_encoder_layers': 9, 'num_decoder_layers': 23, 'input_chunk_length': 229, 'learning_rate': 0.001291641109184171, 'batch_size': 160, 'dropout': 0.31978142700074424}. Best is trial 3 with value: inf.
[I 2025-02-04 16:27:16,790] Trial 5 finished with value: inf and parameters: {'hidden_size': 19, 'num_encoder_layers': 17, 'num_decoder_layers': 13, 'input_chunk_length': 179, 'learning_rate': 0.0004787763019929115, 'batch_size': 185, 'dropout': 0.43286768387686514}. Best is trial 3 with value: inf.
Error while running the trial: 'TiDEModel' object has no attribute '_model_call'
Error while running the trial: 'TiDEModel' object has no attribute '_model_call'
[I 2025-02-04 16:27:16,813] Trial 6 finished with value: inf and parameters: {'hidden_size': 415, 'num_encoder_layers': 26, 'num_decoder_layers': 14, .............. Best is trial 3 with value: inf.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 3060') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
You are using a CUDA device ('NVIDIA GeForce RTX 3060') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
You are using a CUDA device ('NVIDIA GeForce RTX 3060') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

   | Name                  | Type             | Params
------------------------------------------------------------
0  | criterion             | MSELoss          | 0
1  | train_criterion       | MSELoss          | 0
2  | val_criterion         | MSELoss          | 0
3  | train_metrics         | MetricCollection | 0
4  | val_metrics           | MetricCollection | 0
5  | past_cov_projection   | _ResidualBlock   | 2.7 K
6  | future_cov_projection | _ResidualBlock   | 2.9 K
7  | encoders              | Sequential       | 1.6 M
8  | decoders              | Sequential       | 2.7 M
9  | temporal_decoder      | _ResidualBlock   | 726
10 | lookback_skip         | Linear           | 470
------------------------------------------------------------
4.2 M     Trainable params
0         Non-trainable params
4.2 M     Total params
16.998    Total estimated model params size (MB)
Sanity Checking DataLoader 0:   0%|                                                              | 0/2 [00:00<?, ?it/s]

............
Sanity Checking DataLoader 0:  50%|███████████████████████████ | 1/2 [00:00<00:00, 4.23it/s]
   | Name                  | Type             | Params
------------------------------------------------------------
0  | criterion             | MSELoss          | 0
1  | train_criterion       | MSELoss          | 0
2  | val_criterion         | MSELoss          | 0
3  | train_metrics         | MetricCollection | 0
4  | val_metrics           | MetricCollection | 0
5  | past_cov_projection   | _ResidualBlock   | 2.6 K
6  | future_cov_projection | _ResidualBlock   | 2.9 K
7  | encoders              | Sequential       | 6.6 M
8  | decoders              | Sequential       | 4.0 M
9  | temporal_decoder      | _ResidualBlock   | 726
10 | lookback_skip         | Linear           | 1.6 K
------------------------------------------------------------
10.6 M    Trainable params
0         Non-trainable params
10.6 M    Total params
42.555    Total estimated model params size (MB)
| 0/839 [09:09<?, ?it/s]
Epoch 0:  40%|███████████████████▊ | 746/1847 [16:09<23:51, 0.77it/s, train_loss=0.000706]Metric val_loss improved. New best score: 1790449886631968344818121956857276691909715897480148144277677017529282478119415803155216895883218483330690391540006231416796847538176.000
Epoch 0:  50%|████████████████████████▍ | 919/1847 [20:18<20:30, 0.75it/s, train_loss=0.000485]Metric val_loss improved. New best score: 0.075
Epoch 0:  64%|████████████████████████████████▉ | 1492/2314 [30:36<16:51, 0.81it/s, train_loss=0.096]Metric val_loss improved by 1752613978848462301808567874045012061055402658941672252434059408352941154736309937458785829259582895466642233507877786551481939263488.000 >= min_delta = 0.0002. New best score: 37835907783506062710557181009504236993833288610282794383487244409064657357252567819291952372240893571181285474586265268628903428096.000
Epoch 0: 100%|█

B.TRACEBACK

[W 2025-02-04 16:23:44,568] Trial 5 failed with parameters: {'num_blocks': 25, 'num_layers': 14, 'layer_widths': 199, 'input_chunk_length': 24, 'learning_rate': 0.0004479001778431244, 'batch_size': 23, 'dropout': 0.3068416435577809} because of the following error: AttributeError("type object 'NHiTSModel' has no attribute '_model_call'").
Traceback (most recent call last):
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\optuna\study\_optimize.py", line 196, in _run_trial
    value_or_values = func(trial)
                      ^^^^^^^^^^^
  File "N:\Develope\Python\AI\Previsione Titoli 14\Optina Final_OK.py", line 297, in objective
    model = NHiTSModel(
            ^^^^^^^^^^^
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\forecasting_model.py", line 115, in __call__
    return super().__call__(**all_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\nhits.py", line 716, in __init__
    super().__init__(**self._extract_torch_model_params(**self.model_params))
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\utils\torch.py", line 103, in decorator
    return decorated(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\torch_forecasting_model.py", line 293, in __init__
    super().__init__(add_encoders=add_encoders)
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\forecasting_model.py", line 2716, in __init__
    super().__init__(add_encoders=add_encoders)
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\forecasting_model.py", line 143, in __init__
    self._model_params = self._extract_model_creation_params()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\forecasting_model.py", line 2429, in _extract_model_creation_params
    del self.__class__._model_call
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: type object 'NHiTSModel' has no attribute '_model_call'
[W 2025-02-04 16:23:44,570] Trial 1 failed with parameters: {'num_blocks': 25, 'num_layers': 22, 'layer_widths': 105, 'input_chunk_length': 100, 'learning_rate': 3.705626078507646e-05, 'batch_size': 234, 'dropout': 0.23082064991883505} because of the following error: AttributeError("'NHiTSModel' object has no attribute '_model_call'").
Traceback (most recent call last):
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\optuna\study\_optimize.py", line 196, in _run_trial
    value_or_values = func(trial)
                      ^^^^^^^^^^^
  File "N:\Develope\Python\AI\Previsione Titoli 14\Optina Final_OK.py", line 297, in objective
    model = NHiTSModel(
            ^^^^^^^^^^^
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\forecasting_model.py", line 115, in __call__
    return super().__call__(**all_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.....................
[W 2025-02-04 16:23:44,571] Trial 3 failed with parameters: {'num_blocks': 19, 'num_layers': 10, 'layer_widths': 196, 'input_chunk_length': 317, 'learning_rate': 6.229941287864964e-05, 'batch_size': 222, 'dropout': 0.189537895170228} because of the following error: AttributeError("'NHiTSModel' object has no attribute '_model_call'").
Traceback (most recent call last):
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\optuna\study\_optimize.py", line 196, in _run_trial
    value_or_values = func(trial)
                      ^^^^^^^^^^^
  File "N:\Develope\Python\AI\Previsione Titoli 14\Optina Final_OK.py", line 297, in objective
    model = NHiTSModel(
            ^^^^^^^^^^^
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\forecasting_model.py", line 115, in __call__
    return super().__call__(**all_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\nhits.py", line 716, in __init__
    super().__init__(**self._extract_torch_model_params(**self.model_params))
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\utils\torch.py", line 103, in decorator
    return decorated(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\torch_forecasting_model.py", line 293, in __init__
    super().__init__(add_encoders=add_encoders)
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\forecasting_model.py", line 2716, in __init__
    super().__init__(add_encoders=add_encoders)
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\forecasting_model.py", line 143, in __init__
    self._model_params = self._extract_model_creation_params()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\forecasting_model.py", line 2428, in _extract_model_creation_params
    model_params = copy.deepcopy(self._model_call)
                                 ^^^^^^^^^^^^^^^^
..........
  File "C:\Program Files\Linguaggi\Python\WPy64-31241\python-3.12.4.amd64\Lib\site-packages\darts\models\forecasting\forecasting_model.py", line 2429, in _extract_model_creation_params
    del self.__class__._model_call
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: type object 'NHiTSModel' has no attribute '_model_call'
[W 2025-02-04 16:23:44,573] Trial 5 failed with value None.
[W 2025-02-04 16:23:44,578] Trial 1 failed with value None.
[W 2025-02-04 16:23:44,582] Trial 2 failed with value None.
[W 2025-02-04 16:23:44,587] Trial 3 failed with value None.
[W 2025-02-04 16:23:44,590] Trial 4 failed with value None.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 3060') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name            | Type             | Params
-----------------------------------------------------
0 | criterion       | MSELoss          | 0
1 | train_criterion | MSELoss          | 0
2 | val_criterion   | MSELoss          | 0
3 | train_metrics   | MetricCollection | 0
4 | val_metrics     | MetricCollection | 0
5 | stacks          | ModuleList       | 29.0 M
-----------------------------------------------------
28.6 M    Trainable params
409 K     Non-trainable params
29.0 M    Total params
116.133   Total estimated model params size (MB)
Epoch 0: 100%|█████████████████████████████████████| 195/195 [01:18<00:00,  2.47it/s, train_loss=24.90, val_loss=6.630]Metric val_loss improved. New best score: 6.634
Epoch 0, global step 195: 'val_loss' reached 6.63379 (best 6.63379), saving model to 'N:\\Develope\\Python\\AI\\Previsione Titoli 14\\Weight-Models\\europa§AS_0_NHiTSModel_3000.pt.ckpt' as top 1
Epoch 1: 100%|█████████████████████████████████████| 195/195 [01:23<00:00,  2.32it/s, train_loss=21.90, val_loss=5.570]Metric val_loss improved by 1.059 >= min_delta = 0.0002. New best score: 5.574
Epoch 1, global step 390: 'val_loss' reached 5.57436 (best 5.57436), saving model to 'N:\\Develope\\Python\\AI\\Previsione Titoli 14\\Weight-Models\\europa§AS_0_NHiTSModel_3000.pt.ckpt' as top 1
Epoch 2: 100%|█████████████████████████████████████| 195/195 [02:01<00:00,  1.60it/s, train_loss=15.80, val_loss=5.150]Metric val_loss improved by 0.423 >= min_delta = 0.0002. New best score: 5.151
Epoch 2, global step 585: 'val_loss' reached 5.15107 (best 5.15107), saving model to 'N:\\Develope\\Python\\AI\\Previsione Titoli 14\\Weight-Models\\europa§AS_0_NHiTSModel_3000.pt.ckpt' as top 1
Epoch 3: 100%|█████████████████████████████████████| 195/195 [03:17<00:00,  0.99it/s, train_loss=7.220, val_loss=4.340]Metric val_loss improved by 0.815 >= min_delta = 0.0002. New best score: 4.336
Epoch 3, global step 780: 'val_loss' reached 4.33560 (best 4.33560), saving model to 'N:\\Develope\\Python\\AI\\Previsione Titoli 14\\Weight-Models\\europa§AS_0_NHiTSModel_3000.pt.ckpt' as top 1
Epoch 4: 100%|█████████████████████████████████████| 195/195 [03:34<00:00,  0.91it/s, trai
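
Both tracebacks end in `del self.__class__._model_call`, i.e. the model constructor reads and then deletes an attribute stored on the class itself. Since Optuna runs trials in threads when `n_jobs` > 1, two trials constructing the same model class at the same time could plausibly delete each other's attribute. The following is a minimal sketch of that pattern and of one possible workaround, serializing model construction with a lock. `CallRecorder`, `ToyModel` and `build_model` are hypothetical stand-ins written for illustration, not Darts code, and whether this fully explains the error above is an assumption.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CallRecorder(type):
    """Toy metaclass: stores constructor kwargs on the class before __init__ runs."""

    def __call__(cls, **kwargs):
        cls._model_call = kwargs  # shared by every thread using this class
        return super().__call__(**kwargs)


class ToyModel(metaclass=CallRecorder):
    def __init__(self, **kwargs):
        # Read the class-level attribute, then delete it, as in the traceback.
        self.params = dict(self._model_call)
        del self.__class__._model_call


_build_lock = threading.Lock()


def build_model(**kwargs):
    """Construct the model while holding a lock, so only one thread at a time
    goes through the set / read / delete sequence on the class attribute."""
    with _build_lock:
        return ToyModel(**kwargs)


def run_trial(i):
    # Stand-in for an objective: only the construction step is shown.
    return build_model(hidden_dim=i).params["hidden_dim"]


with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(run_trial, range(200)))
```

In the real objective the lock would only need to wrap the `NHiTSModel(...)` or `TiDEModel(...)` constructor call, so training itself could still overlap across trials. Running with `n_jobs=1` would sidestep the question entirely.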

Upvotes: 0

Views: 20

Answers (0)
