user8617180

Reputation: 277

Check if Google storage bucket or file exists using Spark Scala

I want to check if a Google storage bucket exists using spark-scala. If it doesn't exist, create it.

Can somebody help?

Upvotes: 2

Views: 4264

Answers (2)

aletts54

Reputation: 11

You can check whether a given Google Cloud Storage path exists. If it does not, write an empty DataFrame to the path plus a folder so Spark creates the bucket with a folder inside, and then have Spark delete that folder.

I made it with PySpark, but it translates easily to Scala; a sketch of such a translation follows the snippet.

p = spark._jvm.org.apache.hadoop.fs.Path(path)
fs = p.getFileSystem(spark._jsc.hadoopConfiguration())

if not fs.exists(p):
    df = spark.range(0)  # empty placeholder DataFrame
    df.write.mode("overwrite").parquet("gs://...../folder")
    # delete() expects a Hadoop Path, not a string
    fs.delete(spark._jvm.org.apache.hadoop.fs.Path("gs://...../folder"), True)
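
Since the question asks for Scala, here is a rough Scala sketch of the same approach. It assumes an active SparkSession named spark, the GCS connector on the classpath, and a placeholder bucket gs://my-bucket (replace it with your own).

import org.apache.hadoop.fs.Path

val bucketPath = new Path("gs://my-bucket")  // hypothetical bucket
val fs = bucketPath.getFileSystem(spark.sparkContext.hadoopConfiguration)

if (!fs.exists(bucketPath)) {
  // Writing a small DataFrame to the bucket makes the connector create it
  val tmpFolder = "gs://my-bucket/tmp"
  spark.range(0).write.mode("overwrite").parquet(tmpFolder)
  fs.delete(new Path(tmpFolder), true)  // drop the helper folder, keep the bucket
}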

Upvotes: 0

Pawel Czuczwara

Reputation: 1520

To access a Google Cloud Storage bucket, use the Google Cloud Client Libraries:

  1. To check whether a bucket exists, use the get_bucket method; it also gives you access to the bucket's metadata
  2. To create a new bucket, use the create_bucket method

Please note that in order to read from or write to the bucket, you need to set the proper permissions on the storage.
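
For Scala, the same operations are available through the Java client library (the google-cloud-storage artifact). The following is only a sketch: my-bucket is a placeholder bucket name, and application default credentials are assumed.

import com.google.cloud.storage.{BucketInfo, Storage, StorageOptions}

object EnsureBucket {
  def main(args: Array[String]): Unit = {
    // Uses application default credentials
    val storage: Storage = StorageOptions.getDefaultInstance.getService
    val bucketName = "my-bucket"  // hypothetical bucket name

    // storage.get returns null when the bucket does not exist
    Option(storage.get(bucketName)) match {
      case Some(bucket) =>
        println(s"Bucket $bucketName exists in location ${bucket.getLocation}")
      case None =>
        storage.create(BucketInfo.of(bucketName))
        println(s"Created bucket $bucketName")
    }
  }
}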


Upvotes: 1
