Reputation: 61
I want to train a N-Beats time series model using Darts. I have a time serie DataFrame for each users so I want to use Multiple-Series training but when I feed the list of TimeSeries I directly get NaN as losses during training. If I concatenate all users's TimeSeries into one, I get a normal loss. In both cases the data is scale, fill and cast to float.32
data = scaler.transform(filler.transform(data)).astype(np.float32)
Here is the code that I use combine the list of TimeSeries into a single TimeSeries. I also have a pure Darts code for that but it is much slower for the same result.
SPLIT = 0.8
if concatenate_to_one_ts:
all_dfs = []
all_dfs_cov = []
for i in range(len(list_of_target_ts)):
all_dfs.append(list_of_target_ts[i].pd_series())
all_dfs_cov.append(list_of_cov_ts[i].pd_dataframe())
all_dfs = pd.concat(all_dfs)
all_dfs_cov = pd.concat(all_dfs_cov)
nbr_train_sample = int(len(all_dfs) * SPLIT)
all_dfs_train = all_dfs[:nbr_train_sample]
all_dfs_test = all_dfs[nbr_train_sample:]
list_of_target_ts_train = TimeSeries.from_series(all_dfs_train.reset_index(drop=True))
list_of_target_ts_test = TimeSeries.from_series(all_dfs_test.reset_index(drop=True))
all_dfs_cov_train = all_dfs_cov[:nbr_train_sample]
all_dfs_cov_test = all_dfs_cov[nbr_train_sample:]
list_of_cov_ts_train = TimeSeries.from_dataframe(all_dfs_cov_train.reset_index(drop=True))
list_of_cov_ts_test = TimeSeries.from_dataframe(all_dfs_cov_test.reset_index(drop=True))
else:
nbr_train_sample = int(len(list_of_target_ts) * SPLIT)
list_of_target_ts_train = list_of_target_ts[:nbr_train_sample]
list_of_target_ts_test = list_of_target_ts[nbr_train_sample:]
list_of_cov_ts_train = list_of_cov_ts[:nbr_train_sample]
list_of_cov_ts_test = list_of_cov_ts[nbr_train_sample:]
model = NBEATSModel(input_chunk_length=4,
output_chunk_length=1,
batch_size=512,
n_epochs=5,
nr_epochs_val_period=1,
model_name="NBEATS_test",
generic_architecture=True,
force_reset=True,
save_checkpoints=True,
show_warnings=True,
log_tensorboard=True,
torch_device_str='cuda:0'
)
model.fit(series=list_of_target_ts_train,
past_covariates=list_of_cov_ts_train,
val_series=list_of_target_ts_val,
val_past_covariates=list_of_cov_ts_val,
verbose=True,
num_loader_workers=20)
As Multiple-Series training I get:
Epoch 0: 8%|██████████▉ | 2250/27807 [03:00<34:11, 12.46it/s, loss=nan, v_num=logs, train_loss=nan.0
As a single serie training I get:
Epoch 0: 24%|█████████████████████████▋ | 669/2783 [01:04<03:24, 10.33it/s, loss=0.00758, v_num=logs, train_loss=0.00875]
I am also confused by the number of sample per epoch with the same batch size as from what I read here: https://unit8.com/resources/training-forecasting-models/ the single serie should have more sample as the window size cut is not happening for each Multiple Series.
Upvotes: 2
Views: 1356
Reputation: 61
Thanks Julien Herzen your answer helped me a lot to find the issue. I want to add more details on what was happening.
getting the length of the longest sub series and multiplying by the number of series.
self.max_samples_per_ts = (max(len(ts) for ts in self.target_series) - self.size_of_both_chunks + 1)
then when getitem is called
target_idx = idx // self.max_samples_per_ts
target_series = self.target_series[target_idx]
It select a series by dividing the idx by the max number of samples thus shorter series will be sampled more than longer one as they have less data but same chance to get sampled.
Here is my smallest example with input_chunk_length = 4 and output = 1: Multi series with lengths: [71, 19] -> number of samples (71 * 2) - (2 * input_chunk_length) = 134
Concantenated into a Single serie: 90 -> number of samples: 90 - input_chunk_length = 86
In the multi series the sample in the short sub series will likely be sampled more time.
Upvotes: 1
Reputation: 171
Upvotes: 3