Virendra

Reputation: 2553

Airflow unable to mount a Google Cloud Storage bucket using gcsfuse

I want to mount a Google Cloud Storage bucket into my Airflow environment so that I can read and write files in that bucket. I am using Cloud Composer 2 (image composer-2.1.14-airflow-2.5.1).

In Airflow, I created a DAG that runs the following Bash script:

#!/bin/bash

BUCKET="my-bucket"
MOUNT_DIR="/home/airflow/gcs/data/my-bucket"

# Create the $MOUNT_DIR directory and grant group write permission
mkdir -p "$MOUNT_DIR"
sudo chmod g+w "$MOUNT_DIR"

# Mount the GCS bucket
gcsfuse --foreground --debug_fuse --debug_fs --debug_gcs --debug_http -o nonempty "$BUCKET" "$MOUNT_DIR"

Here are the logs from Airflow:

[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - Start gcsfuse/0.42.3 (Go version go1.19.5) for app "" using mount point: /home/airflow/gcs/data/my-bucket
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - Opening GCS connection...
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - Creating a mount at "/home/airflow/gcs/data/my-bucket"
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - Creating a new server...
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - Set up root directory for bucket my-bucket
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - gcs: Req              0x0: <- ListObjects("")
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - gcs: Req              0x0: -> ListObjects("") (131.395831ms): OK
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - Mounting file system "my-bucket"...
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - fuse_debug: Beginning the mounting kickoff process
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - fuse_debug: Parsing fuse file descriptor
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - fuse_debug: Preparing for direct mounting
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - fuse_debug: Directmount failed. Trying fallback.
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - fuse_debug: Creating a socket pair
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - fuse_debug: Creating files to wrap the sockets
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - fuse_debug: Starting fusermount/os mount
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - /usr/bin/fusermount: fuse device not found, try 'modprobe fuse' first
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - Error while mounting gcsfuse: mountWithConn: Mount: mount: running /usr/bin/fusermount: exit status 1
[2023-09-20, 11:46:39 PDT] {subprocess.py:93} INFO - mountWithArgs: mountWithConn: Mount: mount: running /usr/bin/fusermount: exit status 1

I have already verified that Airflow can access the bucket: running the following command lists the files in the bucket:

gsutil ls gs://$BUCKET

I also tried running the following command, and I still get the same error as above:

sudo mount -t gcsfuse -o rw,user $BUCKET $MOUNT_DIR

I have consulted several reference pages, but I am still not able to mount the bucket.

Update: I upgraded the Composer environment to composer-2.4.2-airflow-2.5.3 and I still see the following error:

[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"INFO","severity":"INFO","message":"Start gcsfuse/1.0.1 (Go version go1.20.5) for app \"\" using mount point:/home/airflow/gcs/data/my-bucket\n","timestampSeconds":1695254138,"timestampNanos":83062812}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"INFO","severity":"INFO","message":"Opening GCS connection...\n","timestampSeconds":1695254138,"timestampNanos":83799366}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"INFO","severity":"INFO","message":"Creating a mount at \"/home/airflow/gcs/data/datavant/my-bucket\"\n","timestampSeconds":1695254138,"timestampNanos":87562370}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"INFO","severity":"INFO","message":"Creating a new server...\n","timestampSeconds":1695254138,"timestampNanos":87589651}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"INFO","severity":"INFO","message":"Set up root directory for bucket my-bucket\n","timestampSeconds":1695254138,"timestampNanos":87599362}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"gcs: Req              0x0: \u003c- ListObjects(\"\")\n","timestampSeconds":1695254138,"timestampNanos":87612220}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"gcs: Req              0x0: -\u003e ListObjects(\"\") (106.665835ms): OK\n","timestampSeconds":1695254138,"timestampNanos":194287578}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"INFO","severity":"INFO","message":"Mounting file system \"my-bucket\"...\n","timestampSeconds":1695254138,"timestampNanos":194342795}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Beginning the mounting kickoff process\n","timestampSeconds":1695254138,"timestampNanos":194916407}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Parsing fuse file descriptor\n","timestampSeconds":1695254138,"timestampNanos":194977401}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Preparing for direct mounting\n","timestampSeconds":1695254138,"timestampNanos":194984093}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Directmount failed. Trying fallback.\n","timestampSeconds":1695254138,"timestampNanos":195003380}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Creating a socket pair\n","timestampSeconds":1695254138,"timestampNanos":195238613}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Creating files to wrap the sockets\n","timestampSeconds":1695254138,"timestampNanos":195260643}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Starting fusermount/os mount\n","timestampSeconds":1695254138,"timestampNanos":195270306}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - /usr/bin/fusermount: fuse device not found, try 'modprobe fuse' first
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - {"name":"root","levelname":"INFO","severity":"INFO","message":"Error while mounting gcsfuse: mountWithConn: Mount: mount: running /usr/bin/fusermount: exit status 1\n","timestampSeconds":1695254138,"timestampNanos":198067902}
[2023-09-20, 16:55:38 PDT] {subprocess.py:93} INFO - mountWithArgs: mountWithConn: Mount: mount: running /usr/bin/fusermount: exit status 1

Upvotes: 0

Views: 914

Answers (3)

Virendra

Reputation: 2553

It is not possible to mount another bucket in a Google Cloud Composer Airflow environment. I confirmed this with Google support.

The workaround was to copy the files I needed into the bucket that holds the Airflow data (DAGs, etc.) and use that as the local filesystem.
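As a minimal sketch of that workaround: Cloud Composer exposes the environment bucket's folders under /home/airflow/gcs (the question's MOUNT_DIR already sits under /home/airflow/gcs/data), so a file copied into the environment bucket's data/ folder can be read through a local path. The helper name composer_local_path and the COMPOSER_GCS_ROOT override are hypothetical names introduced here for illustration, and the gsutil line in the comment uses a placeholder for the environment bucket.

```shell
#!/bin/bash
# Hypothetical helper (not part of Composer): map an object path inside the
# environment bucket to the local path where the worker sees it.
# COMPOSER_GCS_ROOT is an assumed override; the default is the directory
# under which Composer exposes the environment bucket.
composer_local_path() {
    local object_path="${1#/}"                        # drop one leading slash, if any
    local root="${COMPOSER_GCS_ROOT:-/home/airflow/gcs}"
    printf '%s/%s\n' "$root" "$object_path"
}

# Copy step, run separately (placeholder bucket name, shown as a comment only):
#   gsutil cp "gs://my-bucket/input.csv" "gs://<environment-bucket>/data/my-bucket/input.csv"

# Local path a task would then read from:
composer_local_path "data/my-bucket/input.csv"
```

A task can then open the printed path like any local file, with no gcsfuse mount involved.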

Upvotes: 0

Prince Kumar

Reputation: 1

This issue is common for FUSE-based file systems when the container runs in unprivileged mode.

See https://github.com/s3fs-fuse/s3fs-fuse/issues/647#issuecomment-330398877.

I faced a similar problem while mounting with gcsfuse in a Docker container. Running the container with the --privileged flag resolved the issue for me.

It is therefore possible that Airflow is running the container in unprivileged mode. If so, the issue can be resolved by running the container with the --privileged flag.
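One way to check this from inside a task is to test for the FUSE device before attempting the mount. The fusermount message in the logs ("fuse device not found") points at a missing /dev/fuse, which is the standard Linux FUSE device node. The function name fuse_device_present is a hypothetical helper introduced here, and this is only a diagnostic sketch: it does not make the mount work.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the given path (default /dev/fuse)
# exists and is a character device.
fuse_device_present() {
    [ -c "${1:-/dev/fuse}" ]
}

if fuse_device_present; then
    echo "FUSE device found; a gcsfuse mount may be possible"
else
    echo "FUSE device missing; the container likely cannot mount FUSE file systems"
fi
```

Even when the device is present, the mount can still fail if the container lacks the required privileges, so a positive result is a necessary condition rather than a guarantee.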

Upvotes: 0

Tulsi Shah

Reputation: 83

It seems the problem lies not with gcsfuse itself but with the FUSE installation. Can you please try this solution: https://forum.odroid.com/viewtopic.php?p=314535&sid=5decaed4623a9aa6c71619ac677d3bf2#p314535

Upvotes: 0
