jpd527
jpd527

Reputation: 1551

How to quickly export data from R to SQL Server

The standard RODBC package's sqlSave function even as a single INSERT statement (parameter fast = TRUE) is terribly slow for large amounts of data due to non-minimal loading. How would I write data to my SQL server with minimal logging so it writes much more quickly?

Currently trying:

toSQL = data.frame(...);
sqlSave(channel,toSQL,tablename="Table1",rownames=FALSE,colnames=FALSE,safer=FALSE,fast=TRUE);

Upvotes: 20

Views: 44271

Answers (2)

Alex Skorokhod
Alex Skorokhod

Reputation: 540

I completely agree that BULK INSERT is the right option for any data which are non-tiny. However in case you need to add 2-3 lines of e.g. debug message it BULK INSERT seems to be an overkill.

The answer to your question would be a DBI::dbWriteTable() function. Example below (I am connecting my R code to AWS RDS instance of MS SQL Express):

library(DBI)
library(RJDBC)
library(tidyverse)

# Specify where you driver lives
drv <- JDBC(
  "com.microsoft.sqlserver.jdbc.SQLServerDriver",
  "c:/R/SQL/sqljdbc42.jar") 

# Connect to AWS RDS instance
conn <- drv %>%
  dbConnect(
    host = "jdbc:sqlserver://xxx.ccgqenhjdi18.ap-southeast-2.rds.amazonaws.com",
    user = "xxx",
    password = "********",
    port = 1433,
    dbname= "qlik")

if(0) { # check what the conn object has access to
  queryResults <- conn %>%
    dbGetQuery("select * from information_schema.tables")
}

# Create test data
example_data <- data.frame(animal=c("dog", "cat", "sea cucumber", "sea urchin"),
                           feel=c("furry", "furry", "squishy", "spiny"),
                           weight=c(45, 8, 1.1, 0.8))
# Works in 20ms in my case
system.time(
  conn %>% dbWriteTable(
    "qlik.export.test",
    example_data
  )
)

# Let us see if we see the exported results
conn %>% dbGetQuery("select * FROM qlik.export.test")

# Let's clean the mess and force-close connection at the end of the process
conn %>% dbDisconnect()

It works pretty fast for small amount of data transferred and seems rather elegant if you want data.frame -> SQL table solution.

Enjoy!

Upvotes: 0

jpd527
jpd527

Reputation: 1551

By writing the data to a CSV locally and then using a BULK INSERT (not readily available as a prebuilt function akin to sqlSave), the data can be written to the MS SQL Server very quickly.

toSQL = data.frame(...);
write.table(toSQL,"C:\\export\\filename.txt",quote=FALSE,sep=",",row.names=FALSE,col.names=FALSE,append=FALSE);
    sqlQuery(channel,"BULK
                INSERT Yada.dbo.yada
                FROM '\\\\<server-that-SQL-server-can-see>\\export\\filename.txt'
                WITH
                (
                FIELDTERMINATOR = ',',
                ROWTERMINATOR = '\\n'
                )");

SQL Server must have permission to access the network folder holding the CSV file, or else this process will not work. While it takes some setup with various permissions (the network folder and BULK ADMIN privileges, the reward in speed is infinitely more valuable).

Upvotes: 33

Related Questions