Data versioning of "Hello_World" tutorial

Question

I have added "versioned: true" to the "example_iris_data" entry in the "catalog.yml" file of the "hello_world" tutorial:

example_iris_data:
  type: pandas.CSVDataSet
  filepath: data/01_raw/iris.csv
  versioned: true

Then, when I ran the tutorial with "kedro run", I got the following error: "VersionNotFoundError: Did not find any versions for CSVDataSet".

May I know the right way to version the "iris.csv" file? Thanks!

921kiyo · Accepted Answer

Try versioning one of the downstream outputs instead. For example, add this entry to your catalog.yml and then run kedro run:

example_train_x:
  type: pandas.CSVDataSet
  filepath: data/02_intermediate/example_iris_data.csv
  versioned: true

After the run, you will see an example_iris_data.csv directory (not a file) under data/02_intermediate. Kedro stores each saved version inside that directory as <version>/example_iris_data.csv. The reason example_iris_data gives you an error is that it is the starting data: iris.csv already exists as a plain file in data/01_raw, so Kedro cannot create a data/01_raw/iris.csv/ directory because of the name conflict with the existing file, and it finds no versions to load.
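
To make the directory layout concrete, here is a minimal standard-library sketch. It does not call Kedro at all: find_versions is a hypothetical helper that only imitates a lookup under <filepath>/<version>/<filename>, and the timestamp is made up for illustration.

```python
import tempfile
from pathlib import Path


def find_versions(filepath: Path) -> list:
    """Return the version names found under <filepath>/<version>/<filename>."""
    # Hypothetical helper: it mimics the on-disk layout only, not Kedro's API.
    if not filepath.is_dir():
        # A plain file cannot contain version sub-directories.
        return []
    return sorted(p.parent.name for p in filepath.glob(f"*/{filepath.name}"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Case 1: the raw input is a plain file, as in the tutorial.
    raw = root / "data" / "01_raw" / "iris.csv"
    raw.parent.mkdir(parents=True)
    raw.write_text("sepal_length,species\n5.1,setosa\n")
    raw_versions = find_versions(raw)

    # Case 2: a versioned output, where the filepath itself is a directory.
    out = root / "data" / "02_intermediate" / "example_iris_data.csv"
    version = "2020-01-01T00.00.00.000Z"  # made-up timestamp, illustration only
    (out / version).mkdir(parents=True)
    (out / version / out.name).write_text("sepal_length\n5.1\n")
    out_versions = find_versions(out)

    print("raw versions:", raw_versions)
    print("output versions:", out_versions)
```

In this sketch the raw file yields no versions because iris.csv is a file rather than a directory, while the output path yields one version because the file sits one level down, inside a version folder.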

Hope this helps :)
