Reputation: 1063
I have created a Composer environment with image version -> composer-2.0.13-airflow-2.2.5
when i try to install software using PyPi, it fails. details below :
Command :
gcloud composer environments update $AIRFLOW --location us-east1 --update-pypi-packages-from-file requirements.txt
requirement.txt
---------------
google-api-core
google-auth
google-auth-oauthlib
google-cloud-bigquery
google-cloud-core
google-cloud-storage
google-crc32c
google-resumable-media
googleapis-common-protos
google-endpoints
joblib
json5
jsonschema
pandas
requests
requests-oauthlib
Error :
Karans-MacBook-Pro:composer_dags karanalang$ gcloud composer environments update $AIRFLOW --location us-east1 --update-pypi-packages-from-file requirements.txt
Waiting for [projects/versa-sml-googl/locations/us-east1/environments/versa-composer3] to be updated with [projects/versa-sml-googl/locations/us-east1/operations/c23b77a9-f46b-4222-bafd-62527bf27239]..
.failed.
ERROR: (gcloud.composer.environments.update) Error updating [projects/versa-sml-googl/locations/us-east1/environments/versa-composer3]: Operation [projects/versa-sml-googl/locations/us-east1/operations/c23b77a9-f46b-4222-bafd-62527bf27239] failed: Failed to install PyPI packages. looker-sdk 22.4.0 has requirement attrs>=20.1.0; python_version >= "3.7", but you have attrs 17.4.0.
Check the Cloud Build log at https://console.cloud.google.com/cloud-build/builds/60ac972a-8f5e-4b4f-a4a7-d81049fb19a3?project=939354532596 for details. For detailed instructions see https://cloud.google.com/composer/docs/troubleshooting-package-installation
Pls note: I have an older Composer cluster (Composer version - 1.16.8, Airflow version - 1.10.15), where the above command works fine. However, it is not working with the new cluster
What needs to be done to debug/fix this ?
tia!
Upvotes: 0
Views: 960
Reputation: 1063
I was able to get this working using the following code :
path = "gs://dataproc-spark-configs/pip_install.sh"
CLUSTER_GENERATOR_CONFIG = ClusterGenerator(
project_id=PROJECT_ID,
zone="us-east1-b",
master_machine_type="n1-standard-4",
worker_machine_type="n1-standard-4",
num_workers=4,
storage_bucket="dataproc-spark-logs",
init_actions_uris=[path],
metadata={'PIP_PACKAGES': 'pyyaml requests pandas openpyxl kafka-python'},
).make()
with models.DAG(
'Versa-Alarm-Insights-UsingComposer2',
# Continue to run DAG twice per day
default_args=default_dag_args,
schedule_interval='0 0/12 * * *',
catchup=False,
) as dag:
create_dataproc_cluster = DataprocCreateClusterOperator(
task_id="create_dataproc_cluster",
cluster_name="versa-composer2",
region=REGION,
cluster_config=CLUSTER_GENERATOR_CONFIG
)
The earlier command which involved installing packages by reading from file was working in Composer1 (Airflow 1.x), however failing with Composer 2.x (Airflow 2.x)
Upvotes: 1
Reputation: 309
From the error, it is clear that you are running old version of attrs package.
run the below command and try
pip install attrs==20.3.0
or
pip install attrs==20.1.0
Upvotes: 0