Reputation: 41
I have created tabular datasets in Vertex AI / Datasets from some CSV files. However, when I try to use these datasets in AutoML for training and prediction, there is no way to specify the data types of the fields. I could not find in the docs how to apply the "transformations". In theory it supports the following types:
With BigQuery tables, the data types are obvious because the table schema specifies them explicitly. With a CSV file, however, the type of a field is not always obvious, and in my case AutoML sometimes guesses incorrectly. Any ideas how to specify the data types explicitly for CSV files?
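To illustrate why a CSV column's type can be ambiguous, here is a minimal local sketch using only the Python standard library. It is not Vertex AI's actual detection logic, just a naive guesser I am assuming for illustration; the column names (`zip_code`, `price`, `listed_on`, `city`) and the sample rows are hypothetical.

```python
import csv
import io
from datetime import datetime


def guess_type(values):
    """Guess a coarse type for one column from its non-empty string values."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "unknown"

    def is_number(s):
        try:
            float(s)
            return True
        except ValueError:
            return False

    def is_timestamp(s):
        try:
            datetime.fromisoformat(s)
            return True
        except ValueError:
            return False

    # Numbers are checked first, so anything that parses as a float wins.
    if all(is_number(v) for v in non_empty):
        return "numeric"
    if all(is_timestamp(v) for v in non_empty):
        return "timestamp"
    return "categorical"


def guess_column_types(csv_text):
    """Return {column name: guessed type} for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name in columns:
            columns[name].append(row[name] or "")
    return {name: guess_type(vals) for name, vals in columns.items()}


# Hypothetical sample data: zip_code is really a category, not a quantity.
sample = """zip_code,price,listed_on,city
02139,350000.5,2021-06-01,Cambridge
94105,920000,2021-06-03,San Francisco
"""

guessed = guess_column_types(sample)
print(guessed)
```

With this sample, `zip_code` comes out as numeric even though it should be treated as categorical, which is the kind of wrong guess I would like to be able to override.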
Upvotes: 1
Views: 921
Reputation: 570
The Google Cloud Python SDK for Vertex AI does not support transformation of column data types. Currently it can be done only through the Cloud Console.
Once the data has been imported into a Vertex AI dataset and you create a training pipeline, Vertex AI automatically analyzes the provided CSV file and shows the data type it detected for each column, as shown in the image below. The data type transformations are applied after the data is imported.
If Vertex AI identifies a data type incorrectly, you can use the drop-down menu to change it to the desired type, as shown in the image below. Please refer to this video for a demo of building and training models with Vertex AI.
Upvotes: 0