PowellHall

Reputation: 17

Error: gs://bucket/SampleVideo.mp4 cannot be parsed as CSV

I’m trying to create a custom model in AutoML Video Intelligence Classification and keep getting the above error. The problem isn’t specific to one video clip: I get the same error with different videos whenever I use the same CSV. The CSV was created in Excel, with the first column being TRAIN or TEST and the second column being the video URI.

Upvotes: 0

Views: 471

Answers (2)

user

Reputation: 1

I got the same error, and this solved it for me: when you save the CSV file, check whether you have selected "UTF-8" as the encoding. If not, save it again as "UTF-8".
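
If Excel's save dialog is awkward, another option is to write the file from a script. Here is a minimal sketch in Python. The helper name `rows_to_utf8_csv` and the output file name `import.csv` are made up for illustration, and the example rows are the demo URIs quoted elsewhere on this page. It produces plain UTF-8 bytes without a byte order mark; I have not checked whether the importer cares about a byte order mark either way.

```python
import csv
import io


def rows_to_utf8_csv(rows):
    """Serialize rows as CSV text and encode the result as UTF-8 bytes."""
    buf = io.StringIO()
    # The csv module terminates each row with "\r\n" by default.
    csv.writer(buf).writerows(rows)
    return buf.getvalue().encode("utf-8")


if __name__ == "__main__":
    data = rows_to_utf8_csv([
        ["TRAIN", "gs://automl-video-demo-data/hmdb_split1_5classes_train.csv"],
        ["TEST", "gs://automl-video-demo-data/hmdb_split1_5classes_test.csv"],
    ])
    # "import.csv" is a placeholder file name.
    with open("import.csv", "wb") as f:
        f.write(data)
```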

Upvotes: 0

rmesteves

Reputation: 4085

According to the documentation, you need to create two levels of CSV files:

  1. An initial CSV file that points to the TRAIN and TEST data CSV files.
  2. The CSV files referenced by the initial one. There is one CSV file for TRAIN and another for TEST, and the rows of these files reference the videos.

Let's look at each level in more detail:

1. Your first CSV file must look like one of the following:

TRAIN,gs://automl-video-demo-data/hmdb_split1_5classes_train.csv
TEST,gs://automl-video-demo-data/hmdb_split1_5classes_test.csv

Or

UNASSIGNED,gs://automl-video-demo-data/hmdb_split1_5classes_all.csv

Where:

  1. Training data: Used to train the model. Contains paths to video files, start and end times for video segments, and labels identifying the subject of the video segment. If you specify a training data CSV file, you must also specify a testing data CSV file.
  2. Test data: Used for testing the model during the training phase. Contains paths to video files, start and end times for video segments, and labels identifying the subject of the video segment. If you specify a testing data CSV file, you must also specify a training data CSV file.
  3. Unassigned data: Used for both training and testing the model. Contains paths to video files, start and end times for video segments, and labels identifying the subject of the video segment. Rows in the unassigned file are automatically divided into training and test data: 80% for training and 20% for testing. You can specify only an unassigned data CSV file without training and testing data CSV files. You can also specify only the training and testing data CSV files without an unassigned data CSV file.
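
As a rough illustration of the rules above, here is a small Python sketch that builds the text of the first-level CSV. The function name `build_top_level_csv` and its parameters are hypothetical, not part of any Google library. It only enforces what the list above states: TRAIN and TEST must be given together, and at least one data CSV must be referenced.

```python
def build_top_level_csv(train=None, test=None, unassigned=None):
    """Return the text of the first-level CSV that points to the data CSVs."""
    # TRAIN and TEST must be specified together.
    if (train is None) != (test is None):
        raise ValueError("TRAIN and TEST must be specified together")
    # At least one of the two layouts is required.
    if train is None and unassigned is None:
        raise ValueError("specify TRAIN and TEST, or UNASSIGNED")

    lines = []
    if train is not None:
        lines.append(f"TRAIN,{train}")
        lines.append(f"TEST,{test}")
    if unassigned is not None:
        lines.append(f"UNASSIGNED,{unassigned}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_top_level_csv(
        train="gs://automl-video-demo-data/hmdb_split1_5classes_train.csv",
        test="gs://automl-video-demo-data/hmdb_split1_5classes_test.csv",
    ), end="")
```

Note that the second column here is the URI of another CSV file, not of a video. Pointing it directly at an .mp4 file matches the "cannot be parsed as CSV" error in the question.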


2. Each row of your TRAIN, TEST, and UNASSIGNED files must contain the following information:

  1. The content to be categorized or annotated. This field contains Google Cloud Storage URI for the video. Google Cloud Storage URIs are case-sensitive.

  2. A label that identifies how the video is categorized. Labels must start with a letter and only contain letters, numbers, and underscores. You can specify multiple labels for a video by adding multiple rows in the CSV file that each identify the same video segment, with a different label for each row.

  3. Start and end time of the video segment. These two comma-separated fields identify the start and end time of the video segment to analyze, in seconds. The start time must be less than the end time. Both values must be non-negative and within the time range of the video. For example, 0.09845,1.3600555. To use the entire content of the video, specify a start time of 0 and an end time of either the full length of the video or "inf". For example, 0,inf.

Example of a row in your file:

gs://<your-video-path>/vehicle.mp4,mustang,0,5.4

Where:

  • gs://<your-video-path>/vehicle.mp4 is the path to the video
  • mustang is the label
  • 0 is the start time in the video
  • 5.4 is the end time in the video
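
To check rows before uploading, something like the following Python sketch may help. The function name `format_row` is hypothetical, and the label pattern assumes "letters" means ASCII letters, which the documentation text quoted above does not spell out. It cannot check that the times fall within the actual length of the video.

```python
import re

# Assumption: labels use ASCII letters, digits, and underscores only.
LABEL_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def format_row(uri, label, start, end):
    """Validate one data row and return it as a CSV line (no newline)."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    if not LABEL_RE.fullmatch(label):
        raise ValueError(f"invalid label: {label!r}")
    # Start must be non-negative and strictly less than end.
    if start < 0 or not start < end:
        raise ValueError(f"invalid segment: {start!r}, {end!r}")
    # str(float("inf")) is "inf", the marker for the end of the video.
    return f"{uri},{label},{start},{end}"


if __name__ == "__main__":
    print(format_row("gs://<your-video-path>/vehicle.mp4", "mustang", 0, 5.4))
    print(format_row("gs://<your-video-path>/vehicle.mp4", "mustang", 0, float("inf")))
```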

Upvotes: 1
