Reputation: 135
I am loading a dataset from BigQuery and after some transformations, I'd like to save the transformed DataFrame back into BigQuery. Is there a way of doing this?
This is how I am loading the data:
df = spark.read \
.format('bigquery') \
.option('table', 'publicdata.samples.shakespeare') \
.load()
Some transformations:
df_new = df.select("word")
And this is how I am trying to save the transformed data as a new table in my project:
df_new \
.write \
.mode('overwrite') \
.format('bigquery') \
.save('my_project.some_schema.df_new_table')
Is this even possible? Is there a way to save to BQ directly?
PS: I know the following works, but it is not exactly what I am looking for:
df_new \
.write \
.mode('overwrite') \
.format('csv') \
.save('gs://my_bucket/df_new.csv')
Thanks!
Upvotes: 8
Views: 25628
Reputation: 1
As you mentioned, the data is not written to BigQuery directly. It is first written to Google Cloud Storage and then loaded into BigQuery from there. To achieve this, set a staging bucket before the write statement:
bucket = "<your-bucket-name>"
spark.conf.set("temporaryGcsBucket", bucket)
wordCountDf.write.format('bigquery').option('table', 'projectname.dataset.table_name').save()
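For what it's worth, newer versions of the spark-bigquery-connector can also skip the GCS staging step entirely by writing through the BigQuery Storage Write API. A minimal sketch, assuming a connector version that supports the writeMethod option (the table name is a placeholder):

# Direct write via the BigQuery Storage Write API; no temporaryGcsBucket needed.
# Assumes a recent spark-bigquery-connector; save-mode support may vary by version.
df_new.write \
    .format('bigquery') \
    .option('writeMethod', 'direct') \
    .save('my_project.some_schema.df_new_table')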
Upvotes: 0
Reputation: 1004
Here is the documentation for the BigQuery connector with Spark
This is the recommended way:
# Saving the data to BigQuery
word_count.write.format('bigquery') \
.option('table', 'wordcount_dataset.wordcount_output') \
.save()
Note that the table is set in option() rather than passed to save().
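If you don't want to set the staging bucket globally on the Spark session, the connector also accepts it per write. A minimal sketch, assuming the connector's temporaryGcsBucket write option (the bucket name is a placeholder):

# Stage this write through GCS without touching spark.conf.
word_count.write.format('bigquery') \
    .option('table', 'wordcount_dataset.wordcount_output') \
    .option('temporaryGcsBucket', 'my-staging-bucket') \
    .save()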
Upvotes: 5
Reputation: 21
The following syntax will create or overwrite the table:
df.write.format('bigquery').option('table', 'project.db.tablename').mode('overwrite').save()
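The same pattern works with Spark's other save modes, e.g. appending instead of replacing. A minimal sketch with placeholder project, dataset, table, and bucket names:

# Append rows to an existing table rather than overwriting it.
df.write.format('bigquery') \
    .option('table', 'project.dataset.table_name') \
    .option('temporaryGcsBucket', 'my-staging-bucket') \
    .mode('append') \
    .save()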
For more information, you can refer to the following link: https://dbmstutorials.com/pyspark/spark-dataframe-write-modes.html
Upvotes: 0