Reputation: 15246
Dataflow for Python SDK has a --requirements_file option that can take a standard requirements.txt and install it on its workers before running. Are there any restrictions on using these files? Specifically, can I use all pip flags (e.g. --editable or -e) to install my local packages?
Upvotes: 1
Views: 1235
Reputation: 15246
Dataflow for Python SDK will run pip install -r requirements.txt before starting your workload. It is important that all the items referenced in the requirements file are accessible to the worker machines. Dependencies on PyPI or some other accessible location (e.g. over HTTP) will install correctly. Local packages (e.g. -e my_package) will not, because the workers cannot access them.
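To catch this before submitting a job, you could scan the requirements file for entries that point at your local filesystem. The sketch below is only a rough heuristic, not something Dataflow or pip provides: the function name find_local_requirements and the list of "local-looking" prefixes are my own assumptions.

```python
def find_local_requirements(lines):
    """Return requirement entries that appear to point at the local filesystem.

    Hypothetical helper: the prefix list below is a heuristic, not pip's
    own parsing logic.
    """
    local_prefixes = (".", "/", "~", "file:")
    flagged = []
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        target = line
        # Strip an editable flag, if present, to look at what it points to.
        for flag in ("--editable", "-e"):
            if line.startswith(flag):
                target = line[len(flag):].lstrip(" =")
                break
        if target.startswith(local_prefixes):
            flagged.append(line)
    return flagged


if __name__ == "__main__":
    sample = ["numpy", "-e ./my_package", "requests"]
    for entry in find_local_requirements(sample):
        print("Not reachable from workers:", entry)
```

Anything this flags is a candidate for staging with --extra_package instead.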
The --extra_package option allows staging local packages in an accessible way. Instead of listing local packages in requirements.txt, create a tarball of the local package (e.g. my_package.tar.gz) and use the --extra_package option to stage it.
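As a minimal sketch of how the two options fit together, the helper below assembles the command-line flags for a pipeline. The function name build_dependency_args and the dist/ path are hypothetical. It assumes the tarball was already built (for example with python setup.py sdist) and that --extra_package may be given once per package.

```python
def build_dependency_args(requirements_file=None, extra_packages=()):
    """Build pipeline flags for remote and local dependencies.

    Hypothetical helper for illustration; only the flag names
    --requirements_file and --extra_package come from the SDK.
    """
    args = []
    if requirements_file:
        # Packages installable from PyPI or another reachable location.
        args.append(f"--requirements_file={requirements_file}")
    for package in extra_packages:
        # Local packages, pre-built as tarballs so they can be staged.
        args.append(f"--extra_package={package}")
    return args


if __name__ == "__main__":
    flags = build_dependency_args(
        requirements_file="requirements.txt",
        extra_packages=["dist/my_package.tar.gz"],  # hypothetical path
    )
    # With apache_beam installed, a list like this could be passed to
    # apache_beam.options.pipeline_options.PipelineOptions(flags).
    print(flags)
```

The same flags can also be passed directly on the command line when launching the pipeline script.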
Managing Python Pipeline Dependencies has more details on these options.
Upvotes: 2