Reputation: 431
I am currently running Google Cloud Composer with Composer version 2.0.9 and Airflow version 2.1.4. I am trying to install the most recent version of dbt (1.0.4 for core and 1.0.0 for the BigQuery plugin). Because Cloud Composer images come with specific packages preinstalled, I am getting conflicting PyPI dependency issues. When I try to fix one dependency, another issue occurs. Does anyone know a specific set of package versions that would resolve this? I have read the following community posts, but I wanted to know whether anyone has a solution that uses only Composer:
How to run DBT in airflow without copying our repo
How to set up dbt with Google Cloud Composer?
Upvotes: 3
Views: 1299
Reputation: 431
As mentioned by @Kabilan Mohanraj, the current version of dbt (1.0.4) has dependency conflicts with a more recent version of Composer (Composer version 2.0.9 and Airflow version 2.1.4), so an alternative solution is needed. In my case, I experimented and looked through solutions from other people in the community, and found one person using a particular combination of Composer and dbt versions that had only minimal dependency issues. However, as @Kabilan Mohanraj also mentioned, Google does not recommend downgrading preinstalled packages, so this would not be a viable solution for production.
Create the Composer environment through gcloud to use an older image version that is not available via the Composer UI:
gcloud composer environments create my_airflow_dbt_example \
    --location us-central1 \
    --image-version composer-1.17.9-airflow-2.1.4
PyPI requirements:
dbt-bigquery==0.21.0
jsonschema==3.1.1
packaging==20.9
For this specific Composer version, you are downgrading jsonschema from 3.2.0 to 3.1.1 and packaging from 21.3 to 20.9.
Upvotes: 2
Reputation: 1906
I was able to reproduce the behaviour you are seeing. Below are the dependency conflicts I saw in the Cloud Build logs. These conflicts occur between the dbt-core requirements and the pre-installed package requirements in Composer.
Pre-installed package requirements:
hologram 0.0.14 has requirement jsonschema<3.2,>=3.0, but you have jsonschema 3.2.0. ##=> can be installed manually
flask 1.1.4 has requirement click<8.0,>=5.1, but you have click 8.1.2.
apache-airflow 2.1.4+composer has requirement markupsafe<2.0,>=1.1.1, but you have markupsafe 2.0.1.
looker-sdk 22.4.0 has requirement typing-extensions>=4.1.1, but you have typing-extensions 3.10.0.2.
dbt-core requirements:
hologram 0.0.14 has requirement jsonschema<3.2,>=3.0, but you have jsonschema 3.2.0. ##=> can be installed manually
dbt-core 1.0.4 has requirement click<9,>=8, but you have click 7.1.2.
dbt-core 1.0.4 has requirement MarkupSafe==2.0.1, but you have markupsafe 1.1.1.
dbt-core 1.0.4 has requirement typing-extensions<3.11,>=3.7.4, but you have typing-extensions 4.1.1.
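To make it clearer why fixing one dependency breaks another, here is a minimal sketch that intersects the version ranges quoted in the logs above. The helper names (parse, common_range) and the CONSTRAINTS table are hypothetical and written only for this illustration; the bounds are copied from the log lines, and the parsing assumes plain numeric versions with at most three components.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, int, int]
# (inclusive lower bound, exclusive upper bound or None if unbounded)
Range = Tuple[str, Optional[str]]


def parse(version: str) -> Version:
    """Turn '8' or '3.11' into a zero-padded 3-tuple so tuples compare correctly."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def common_range(ranges: Dict[str, Range]) -> Optional[Tuple[Version, Optional[Version]]]:
    """Return the range allowed by every requirement, or None if it is empty."""
    lower = max(parse(lo) for lo, _ in ranges.values())
    uppers = [parse(hi) for _, hi in ranges.values() if hi is not None]
    upper = min(uppers) if uppers else None
    if upper is not None and lower >= upper:
        return None  # no version satisfies all requirements at once
    return lower, upper


# Bounds taken from the Cloud Build log lines quoted above.
CONSTRAINTS: Dict[str, Dict[str, Range]] = {
    "click": {
        "flask 1.1.4": ("5.1", "8.0"),
        "dbt-core 1.0.4": ("8", "9"),
    },
    "typing-extensions": {
        "looker-sdk 22.4.0": ("4.1.1", None),
        "dbt-core 1.0.4": ("3.7.4", "3.11"),
    },
}

for package, ranges in CONSTRAINTS.items():
    result = common_range(ranges)
    status = "no common version" if result is None else f"compatible range: {result}"
    print(f"{package}: {status}")
```

For both click and typing-extensions the ranges do not overlap, so no single installed version can satisfy dbt-core 1.0.4 and the pre-installed packages at the same time.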
I tried downgrading the pre-installed packages, but subsequent package installations failed, and downgrading them is not recommended in any case.
Therefore, I would suggest using an external solution, as described in the thread you linked. Quoting the workarounds given in @Ryan Yuan's answer:
- Using external services to run dbt jobs, e.g. Cloud Run.
- Using Composer's KubernetesPodOperator (updated Composer 2 link). My colleague has put up a nice article on dbt discourse going through the setup process.
- Ignoring Composer's dependency conflicts by setting Composer's environment variable IGNORE_PYPI_DEPENDENCY_CONFLICTS to True. However, I don't recommend this as it may cause potential issues.
- Creating a Python virtual environment in Composer and installing the dbt packages.
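As a rough illustration of the last workaround, the sketch below only builds the commands that would create an isolated virtual environment and install dbt into it, so that dbt's pins never touch Composer's pre-installed packages. Everything here is an assumption made for illustration: the helper names, the /tmp/dbt_venv location, the project and profiles paths, the Linux-style bin/ layout, and the idea that your Composer workers allow creating a venv and reaching PyPI. How you run these commands (for example from a task that shells out) is left open.

```python
import sys
from pathlib import PurePosixPath
from typing import List


def venv_commands(venv_dir: str, packages: List[str]) -> List[List[str]]:
    """Commands that create a venv and install the given packages inside it."""
    venv = PurePosixPath(venv_dir)
    return [
        # Create the isolated environment with the standard-library venv module.
        [sys.executable, "-m", "venv", str(venv)],
        # Install with the venv's own pip so Composer's site-packages are untouched.
        [str(venv / "bin" / "pip"), "install", *packages],
    ]


def dbt_run_command(venv_dir: str, project_dir: str, profiles_dir: str) -> List[str]:
    """Command that invokes the dbt executable installed inside the venv."""
    dbt = PurePosixPath(venv_dir) / "bin" / "dbt"
    return [str(dbt), "run", "--project-dir", project_dir, "--profiles-dir", profiles_dir]


# Versions taken from the question; all paths are hypothetical placeholders.
DBT_PACKAGES = ["dbt-core==1.0.4", "dbt-bigquery==1.0.0"]
setup = venv_commands("/tmp/dbt_venv", DBT_PACKAGES)
run = dbt_run_command("/tmp/dbt_venv", "/path/to/dbt_project", "/path/to/profiles")
```

Each entry in setup, followed by run, could then be passed to something like subprocess.run(cmd, check=True) from inside a task. I have not verified this on Composer 2.0.9, so treat it as a starting point rather than a tested setup.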
Upvotes: 3