João Sequeira

Reputation: 67

Large files for GitHub CICD

I have a GitHub repo for a pipeline that requires very large files as input (a basic test dataset would be around 1-2 GB).

I thought about circumventing this by running CI/CD locally, but that won't let CI/CD run when other people want to contribute to the repo, right?

Is there any workflow that allows for complex CI/CD with large datasets, while also enabling CI/CD on pull requests?

Upvotes: 5

Views: 1016

Answers (1)

Vimal Patel

Reputation: 119

Use external storage for the large files in this scenario.

Store the large files externally, e.g. in AWS S3, Google Cloud Storage, Azure Blob Storage, Google Drive, or GitHub Releases (useful for versioned datasets).

Then modify your CI/CD workflow to download the data from that external source.

In GitHub Actions, use wget or aws s3 cp to fetch the dataset before running tests.
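As a rough sketch, a pull-request workflow could fetch the dataset in a step before the tests run. The bucket name, object key, region, and test script below are placeholders, and the example assumes AWS credentials are stored as repository secrets:

```yaml
# Hypothetical GitHub Actions workflow: download a large test dataset
# from S3, then run the pipeline tests against it.
name: pipeline-tests
on: [pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1  # placeholder region

      - name: Download test dataset
        # placeholder bucket and key
        run: aws s3 cp s3://my-dataset-bucket/test-data.tar.gz ./data/

      - name: Run pipeline tests
        # placeholder test command
        run: ./run_tests.sh ./data/test-data.tar.gz
```

Because the dataset lives outside the repo, contributors' pull requests can still trigger the same workflow; they just need the repository secrets (or a read-only access mechanism) to be available to the runner.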

Upvotes: 0
