Rita

Reputation: 31

AutoML training error: Less than 50% of rows successfully generated examples

I am trying to build a classifier using GCP AutoML. I've successfully created the dataset, but when training I get the following error:

 Training pipeline failed with error message: Less than 50% of rows
 successfully generated examples.

This is an imbalanced classification problem, so I am optimising for AUC PRC. Also, the data split is done using a date column.


Any ideas why I am getting this error and how to solve it?

Upvotes: 1

Views: 214

Answers (1)

Jonas D

Reputation: 298

Hey, I realise this is a very old question, but in my case I think the problem was caused by the timestamp column (used for splitting the data) having more than one format in my source BigQuery table: some values looked like 2022-04-16 07:32:25.810000 UTC and others like 2022-04-29 22:20:05 UTC.

Truncating the timestamps so that they are all consistent (as in the BigQuery query below) fixed the issue.

SELECT
  ...
  -- Truncate to the minute so every value has the same precision
  TIMESTAMP_TRUNC(timestamp, MINUTE) AS timestamp
  ...
FROM ...
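If you want to check whether your own data has the same mix of formats before re-running training, here is a rough sketch in Python rather than BigQuery. It assumes you have exported a sample of the timestamp values as strings in the two formats quoted above. The helper names (`parse_utc`, `has_fractional_seconds`, `truncate_to_minute`) are made up for illustration, and this only mirrors the idea of the `TIMESTAMP_TRUNC` fix locally; it is not something AutoML itself does.

```python
from datetime import datetime, timezone

# The two string formats quoted in the answer (assumed to be the only ones present).
FORMATS = ("%Y-%m-%d %H:%M:%S.%f UTC", "%Y-%m-%d %H:%M:%S UTC")


def parse_utc(text):
    """Parse a timestamp string in either format; raise ValueError otherwise."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp format: {text!r}")


def has_fractional_seconds(text):
    """True if the string carries a fractional-seconds part."""
    return "." in text


def truncate_to_minute(text):
    """Drop seconds and microseconds and re-emit in a single format."""
    dt = parse_utc(text).replace(second=0, microsecond=0)
    return dt.strftime("%Y-%m-%d %H:%M:%S UTC")


# Hypothetical sample: the two example values from the answer.
samples = ["2022-04-16 07:32:25.810000 UTC", "2022-04-29 22:20:05 UTC"]

# The column is "mixed" if some values have fractional seconds and some do not.
mixed = len({has_fractional_seconds(s) for s in samples}) > 1
normalised = [truncate_to_minute(s) for s in samples]

print("mixed formats:", mixed)
print(normalised)
```

If `mixed` comes out true for your sample, that would be consistent with the situation I described, and truncating in the source query is probably worth trying.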

Upvotes: 0
