Hope
Hope

Reputation: 119

How to append table in SQL using R?

I am now using odbc/dbplyr R packages to control database. Though I know I could use the function union to bind two tables by row, I don't know how to append one table to another without loading the data into the local memory. Is there any solutions?

Thanks.

Upvotes: 1

Views: 1011

Answers (1)

Simon.S.A.
Simon.S.A.

Reputation: 6931

I approach this using a combination of dbplyr and DBI packages.

The idea is to insert a table query into an insert statement. The insert statement can then be executed on the database:

append_database_table <- function(db_connection, db, schema, tbl_name, table_to_append) {

  query <- glue::glue("INSERT INTO {db}.{schema}.{tbl_name}\n{dbplyr::sql_render(table_to_append)}")

  # execute
  DBI::dbExecute(db_connection, as.character(query))
}

Notes:

  • This all assumes that the table you wish to append to already exists in the database. Different syntax is needed if you are creating a new table.
  • dbplyr::sql_render returns the query that defines the input table
  • query is a text string of valid database code
  • I am using SQL Server syntax. You may need to adjust this if you are using a different flavor of SQL.
  • The above is a minimal example, the full function I use for this contains additional checks and ensures column names are correct. You can find it here along with the other support functions I use for dbplyr work.

This approach works even if your input table has some processing. Consider the following example:

prepared_table = remote_table %>%
  filter(!is.na(col1)) %>%
  group_by(col1) %>%
  summarise(col2sum = sum(col2))

show_query(prepared_table)

This will return SQL code equivalent to:

SELECT col1, SUM(col2) AS col2sum
FROM remote_table
WHERE col1 IS NOT NULL

Appending this using:

append_database_table(db_connection, "db", "my_schema", "existing_table", prepared_table)

Will create the following SQL query and execute it on the database:

INSERT INTO db.my_schema.existing_table
SELECT col1, SUM(col2) AS col2sum
FROM remote_table
WHERE col1 IS NOT NULL

Upvotes: 1

Related Questions