Aditya

Reputation: 133

Does Snowflake always use a staging area?

I am new to Snowflake, just want to understand the data loading in Snowflake.

Let's say I have some files in Azure ADLS or Amazon S3. I use Python to read those files, do some transformations, and load the data into a Snowflake table using the pandas to_sql function. Does Snowflake implicitly use a staging area to load the data first and then move it into the table, or does it load the data directly into the table?

Upvotes: 1

Views: 456

Answers (1)

Lukasz Szozda

Reputation: 176214

The easiest way to figure out which queries are actually executed is to check the QUERY_HISTORY.
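For example, a minimal sketch using the Snowflake Python connector (the account, credentials, and object names below are placeholders) that lists the most recent statements, so you can see whether your load showed up as plain INSERTs or as PUT plus COPY INTO:

    import snowflake.connector  # pip install snowflake-connector-python

    # Placeholder connection details -- replace with your own.
    conn = snowflake.connector.connect(
        account="my_account",
        user="my_user",
        password="my_password",
        warehouse="my_wh",
        database="my_db",
        schema="PUBLIC",
    )

    # INFORMATION_SCHEMA.QUERY_HISTORY() returns recently executed queries,
    # including their query type (PUT, COPY, INSERT, ...).
    cur = conn.cursor()
    cur.execute("""
        SELECT query_text, query_type, start_time
        FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
        ORDER BY start_time DESC
        LIMIT 20
    """)
    for query_text, query_type, start_time in cur.fetchall():
        print(start_time, query_type, query_text[:80])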


Writing Data from a Pandas DataFrame to a Snowflake Database

To write data from a Pandas DataFrame to a Snowflake database, do one of the following:

  • Call the write_pandas() function.
  • Call the pandas.DataFrame.to_sql() method (see the Pandas documentation), and specify pd_writer() as the method to use to insert the data into the database.
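As a hedged sketch of the first option, write_pandas() can be called directly on a connector connection (the connection details and the table name MY_TABLE below are placeholders, and the target table is assumed to already exist):

    import pandas as pd
    import snowflake.connector
    from snowflake.connector.pandas_tools import write_pandas

    # Placeholder connection details -- replace with your own.
    conn = snowflake.connector.connect(
        account="my_account",
        user="my_user",
        password="my_password",
        warehouse="my_wh",
        database="my_db",
        schema="PUBLIC",
    )

    df = pd.DataFrame({"ID": [1, 2, 3], "NAME": ["a", "b", "c"]})

    # write_pandas stages the DataFrame as Parquet files and runs COPY INTO
    # behind the scenes; the return values report how many chunks/rows landed.
    success, nchunks, nrows, _ = write_pandas(conn, df, table_name="MY_TABLE")
    print(success, nchunks, nrows)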

pd_writer(parameters...):

Purpose: pd_writer is an insertion method for inserting data into a Snowflake database.

When calling pandas.DataFrame.to_sql (see the Pandas documentation), pass in method=pd_writer to specify that you want to use pd_writer as the method for inserting data. (You do not need to call pd_writer from your own code. The to_sql method calls pd_writer and supplies the input parameters needed.)
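A minimal sketch of that second option, assuming the snowflake-sqlalchemy package is installed and using placeholder connection details:

    import pandas as pd
    from snowflake.connector.pandas_tools import pd_writer
    from sqlalchemy import create_engine

    # Placeholder connection URL -- requires snowflake-sqlalchemy.
    engine = create_engine(
        "snowflake://my_user:my_password@my_account/my_db/public?warehouse=my_wh"
    )

    df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})

    # method=pd_writer makes to_sql delegate the insert to write_pandas,
    # i.e. Parquet files + PUT to a temporary stage + COPY INTO,
    # instead of issuing row-by-row INSERT statements.
    df.to_sql("my_table", engine, index=False, if_exists="append", method=pd_writer)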

The pd_writer function uses the write_pandas() function to write the data in the DataFrame to the Snowflake database.

and finally write_pandas(parameters...):

Writes a Pandas DataFrame to a table in a Snowflake database.

To write the data to the table, the function saves the data to Parquet files, uses the PUT command to upload these files to a temporary stage, and uses the COPY INTO command to copy the data from the files to the table. You can use some of the function parameters to control how the PUT and COPY INTO statements are executed.
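So when write_pandas is involved, the staging happens implicitly via a temporary stage. Done by hand, roughly the same steps would look like the sketch below (the stage, file, and table names are made up for illustration):

    import snowflake.connector

    # Placeholder connection details and object names, for illustration only.
    conn = snowflake.connector.connect(
        account="my_account",
        user="my_user",
        password="my_password",
        warehouse="my_wh",
        database="my_db",
        schema="PUBLIC",
    )
    cur = conn.cursor()

    # 1. Create a temporary stage (dropped automatically when the session ends).
    cur.execute("CREATE TEMPORARY STAGE my_temp_stage")

    # 2. PUT uploads a local Parquet file into the stage.
    cur.execute("PUT file:///tmp/my_data.parquet @my_temp_stage")

    # 3. COPY INTO loads the staged file into the target table.
    cur.execute("""
        COPY INTO MY_TABLE
        FROM @my_temp_stage
        FILE_FORMAT = (TYPE = PARQUET)
        MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
    """)

Because the stage is temporary, it is cleaned up automatically at the end of the session, which is why you normally never see it unless you look in QUERY_HISTORY.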

Upvotes: 1
