Best practice to manage dependencies between conda and pip

Question

I'm developing a Python library, which depends on multiple packages. I'm struggling to find the most straightforward of managing all those dependencies with the following constraints :

Some of those dependencies are only available as conda packages (technically the source is available but the build process is not something I want to get into)
Other dependencies are only available via pip
I need to install my own library in editable or developer mode
I regularly need to keep the dependencies up-to-date

My current setup for the initial install :

Create a new conda environment
Install the conda-only dependencies with conda install ...
Install my library with pip install -e .

At this point, some packages were installed and are now managed by conda, others by pip. When I want to update my environment, I need to:

Update the conda part of the environment with conda update --all
Update the pip part of the environment by hand

My problem is that this is unstable : when I update all conda packages, it ensures the consistency of the packages it manages. However, I can't guarantee that the environment as a whole stays consistent, and I just realized that I was missing some updates because I forgot to check for updates in the pip part of the environment.

What's the best way to do this ? I've thought of :

Using conda's pip interoperability feature : this seems to work, but I've had some dubious results, probably because of my use of extras_require
Since pip can see the conda packages, the initial install is consistent, which means I can simply reinstall everything when I want to update. This works but is not exactly elegant.

merv · Accepted Answer

The recommendation in the official documentation for managing a Conda environment that also requires PyPI-sourced or pip-installed local packages is to define all dependencies (both Conda and Pip) in a YAML file. Something like:

env.yaml

name: my_env
channels:
 - defaults
dependencies:
 - python=3.8
 - numpy
 - pip
 - pip:
   - some_pypi_only_pkg
   - -e path/to/a/local/pkg

The workflow for updating in such an environment is to update the YAML file (which I would recommend to keep under version control) and then either create a new environment or use

conda env update -f env.yaml

Personally, I would tend to create new envs, rather than mutate (update) an existing one, and use minimal constraints (i.e., >=version) in the YAML. When creating a new env, it should automatically pull the latest consistent packages. Plus, one can keep the previous instances of the env around in case a regression is need during the development lifecycle.

Best practice to manage dependencies between conda and pip

Answers (1)

Related Questions